METHODS OF DIAGNOSIS OF LUNG CANCER, COMPOSITIONS AND METHODS
OF SCREENING FOR MODULATORS OF LUNG CANCER 



CROSS-REFERENCES TO RELATED APPLICATIONS 

This application is related to USSN 60/284,770, filed April 18, 2001; USSN
60/290,492, filed May 10, 2001; USSN 60/334,370, filed November 29, 2001; USSN
60/339,245, filed November 9, 2001; USSN 60/350,666, filed November 13, 2001; and
USSN 60/xxx,xxx, filed April 12, 2002 (docket OMNI-002P); each of which is incorporated
herein by reference in its entirety.

FIELD OF THE INVENTION 
The invention relates to the identification of nucleic acid and protein expression
profiles and nucleic acids, products, and antibodies thereto that are involved in lung cancer;
and to the use of such expression profiles and compositions in diagnosis and therapy of lung
cancer. The invention further relates to methods for identifying and using agents and/or
targets that inhibit lung cancer or related conditions.

BACKGROUND OF THE INVENTION

Lung cancer is the second most commonly occurring cancer in the United States and
is the leading cause of cancer-related death. It is estimated that there are over 160,000 new
cases of lung cancer in the United States every year. Of those who are diagnosed with lung
cancer, 86 percent will die within five years. Lung cancer is the most common visceral
cancer in men and accounts for nearly one third of all cancer deaths in both men and women.
In fact, lung cancer accounts for 7% of all deaths, due to any cause, in both men and women.

Smoking is the primary cause of lung cancer, with more than 80% of lung cancers
resulting from smoking. About 400 to 500 separate gaseous substances are present in the
smoke of a non-filter cigarette. The most noteworthy substances include nitrogen oxides,
hydrogen cyanide, formaldehyde, benzene, and toluene. The particles present in cigarette
smoke contain at least 3,500 individual compounds such as nicotine, tobacco alkaloids
(nornicotine, anatabine, anabasine), polycyclic aromatic hydrocarbons (e.g., benzo(a)pyrene,
B(a)P), naphthalenes, aromatic amines, phenols, and tobacco-specific nitrosamines.

Tobacco-specific nitrosamines are formed during tobacco curing and processing, and
are suspected of causing lung cancer in humans. In rodent studies, regardless of where or
how it is applied, the tobacco-specific nitrosamine known as NNK produces lung adenomas
and lung adenocarcinomas. The tobacco-specific nitrosamine known as NNAL also produces
lung adenocarcinomas in rodents.

Many of the chemicals found in cigarette smoke also affect the nonsmoker inhaling
"secondhand" or sidestream smoke. Indeed, the smoke inhaled by nonsmokers has a
chemical composition similar to the smoke inhaled by smokers, but, importantly, the
carcinogenic tobacco-specific nitrosamines are present in higher
concentrations in secondhand smoke. For this and other reasons, "passive smoking" is an
important cause of lung cancer, causing as many as 3,000 lung cancer deaths in nonsmokers
each year.

In addition to smoking, other factors thought to be causes of lung cancer include
on-the-job exposure to carcinogens such as asbestos and uranium, exposure to chemical hazards
such as radon, polycyclic aromatic hydrocarbons, chromium, nickel, and inorganic arsenic,
genetic factors, and diet.

Histological classification of the various lung cancers defines the types of cancer that
begin in the lung. See, e.g., Travis, et al. (1999) Histological Typing of Lung and Pleural
Tumours (International Histological Classification of Tumours, No. 1). Four major cell types
make up more than 88% of all primary lung neoplasms. These are: squamous or epidermoid
carcinoma, small cell (also called oat cell) carcinoma, adenocarcinoma, and large cell (also
called large cell anaplastic) carcinoma. The remainder include undifferentiated carcinomas,
carcinoids, bronchial gland tumors, and other rarer types. The various cell types have
different natural histories and responses to therapy, and, thus, a correct histologic diagnosis is
the first step of effective treatment.

Small cell lung cancer (SCLC) accounts for 18-25% of all lung cancers. It occurs
less frequently than non-small cell lung cancer, and generally spreads to distant organs more
rapidly than non-small cell lung cancer. In general, at the time of presentation small cell lung
cancers have already spread beyond the bounds where surgery with curative intent
can be undertaken. However, if identified early enough, these cancers are often responsive to
chemotherapy and thoracic radiation treatment.

Non-small cell lung cancers (NSCLC) are the more frequently occurring form of lung
cancer. They comprise squamous cell carcinoma, adenocarcinoma, and large cell carcinoma
and account for more than 75% of all lung cancers. Non-small cell tumors that are localized
at the time of presentation can sometimes be cured with surgery and/or radiotherapy, but
usually are not identified until significant metastasis has occurred, at which point they are
typically not very responsive to surgery, chemotherapy, or radiation treatment.

The screening of asymptomatic persons at high risk for lung cancer has often proven
ineffective. In general, only 5 to 15 percent of lung cancer patients have their disease
detected while they are asymptomatic. Of course, early detection and treatment are critical
factors in the fight against lung cancer. The average survival rate is 49% for those whose
cancer is detected early, before the cancer has spread from the lung. Lung cancer often
spreads outside of the lung, and it may have spread to the bones or brain by the time it is
diagnosed. While the prognosis may be better for lung cancers that are detected early,
because of the lack of effective curative treatments, early detection does not necessarily alter
the total death rate from lung cancer.

Thus, methods for diagnosis and prognosis of lung cancer and effective treatment of
lung cancer would be desirable. Accordingly, provided herein are methods that can be used
in diagnosis and prognosis of lung cancer. Further provided are methods that can be used to
screen candidate therapeutic agents for the ability to modulate, e.g., treat, lung cancer.
Additionally, provided herein are molecular targets and compositions for therapeutic
intervention in lung disease and other metastatic cancers.

SUMMARY OF THE INVENTION 
The present invention provides nucleotide sequences of genes that are up- and down-regulated
in lung cancer cells. Such genes are useful for diagnostic purposes, and also as
targets for screening for therapeutic compounds that modulate lung cancer, such as
antibodies. The methods of detecting nucleic acids of the invention or their encoded proteins
can be used for a number of purposes. Examples include early detection of lung cancers,
monitoring and early detection of relapse following treatment of lung cancers, monitoring
response to therapy of lung cancers, determining prognosis of lung cancers, directing therapy
of lung cancers, selecting patients for postoperative chemotherapy or radiation therapy,
selecting therapy, determining tumor prognosis, treatment, or response to treatment, and early
detection of precancerous lesions of the lung. Examples of benign or precancerous lesions
include: atelectasis, emphysema, bronchitis, chronic obstructive pulmonary disease, fibrosis,
hypersensitivity pneumonitis (HP), interstitial pulmonary fibrosis (IPF), asthma, and
bronchiectasis. Other aspects of the invention will become apparent to the skilled artisan by
the following description of the invention.

In one aspect, the present invention provides a method of detecting a lung cancer-associated
transcript in a cell from a patient, the method comprising contacting a biological
sample from the patient with a polynucleotide that selectively hybridizes to a sequence at
least 80% identical to a sequence as shown in Tables 1A-16. Alternatively, the sample may
be contacted with a specific binding reagent, e.g., antibody.

In one embodiment, the polynucleotide selectively hybridizes to a sequence at least
95% identical to a sequence as shown in Tables 1A-16. In another embodiment, the
polynucleotide comprises a sequence as shown in Tables 1A-16.

In one embodiment, the biological sample is a tissue sample, or a body fluid. In 
another embodiment, the biological sample comprises isolated nucleic acids, e.g., mRNA. 

In one embodiment, the polynucleotide is labeled, e.g., with a fluorescent label. In
one embodiment, the polynucleotide is immobilized on a solid surface. In one embodiment,
the patient is undergoing a therapeutic regimen to treat lung cancer. In another embodiment,
the patient is suspected of having lung cancer. In one embodiment, the patient is a primate,
e.g., a human.

In one embodiment, the method further comprises the step of amplifying nucleic acids 
before the step of contacting the biological sample with the polynucleotide. 

In another aspect, the present invention provides a method of monitoring the efficacy
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining
the level of a lung cancer-associated transcript in the biological sample by contacting the
biological sample with a polynucleotide that selectively hybridizes to a sequence at least 80%
identical to a sequence as shown in Tables 1A-16, thereby monitoring the efficacy of the
therapy. Alternatively, the sample may be evaluated for protein, e.g., by contacting the sample
with an antibody.

In one embodiment, the method further comprises the step of: (iii) comparing the
level of the lung cancer-associated transcript to a level of the lung cancer-associated
transcript in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment. Alternatively, the sample may be evaluated for comparison of protein.

In another aspect, the present invention provides a method of monitoring the efficacy
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining
the level of a lung cancer-associated antibody in the biological sample by contacting the
biological sample with a polypeptide encoded by a polynucleotide that selectively hybridizes
to a sequence at least 80% identical to a sequence as shown in Tables 1A-16, wherein the
polypeptide specifically binds to the lung cancer-associated antibody, thereby monitoring the
efficacy of the therapy.

In one embodiment, the method further comprises the step of: (iii) comparing the
level of the lung cancer-associated antibody to a level of the lung cancer-associated antibody
in a biological sample from the patient prior to, or earlier in, the therapeutic treatment.

In another aspect, the present invention provides a method of monitoring the efficacy
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining
the level of a lung cancer-associated polypeptide in the biological sample by contacting the
biological sample with an antibody, wherein the antibody specifically binds to a polypeptide
encoded by a polynucleotide that selectively hybridizes to a sequence at least 80% identical
to a sequence as shown in Tables 1A-16, thereby monitoring the efficacy of the therapy.

In one embodiment, the method further comprises the step of: (iii) comparing the
level of the lung cancer-associated polypeptide to a level of the lung cancer-associated
polypeptide in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment.

In one aspect, the present invention provides an isolated nucleic acid molecule
consisting of a polynucleotide sequence as shown in Tables 1A-16. In one embodiment, an
expression vector or cell comprises the isolated nucleic acid. In one aspect, the present
invention provides an isolated polypeptide which is encoded by a nucleic acid molecule
having a polynucleotide sequence as shown in Tables 1A-16.

In another aspect, the present invention provides an antibody that specifically binds to
an isolated polypeptide which is encoded by a nucleic acid molecule having a polynucleotide
sequence as shown in Tables 1A-16. In one embodiment, the antibody is conjugated to an
effector component, e.g., a fluorescent label, a radioisotope or a cytotoxic chemical. In one
embodiment, the antibody is an antibody fragment. In another embodiment, the antibody is
humanized.

In one aspect, the present invention provides a method of detecting lung cancer in a
patient, the method comprising contacting a biological sample from the patient with an
antibody or protein as described herein.

In another aspect, the present invention provides a method of detecting antibodies
specific to a lung cancer gene in a patient, the method comprising contacting a biological
sample from the patient with a polypeptide encoded by a nucleic acid that comprises a sequence
from Tables 1A-16.

In another aspect, the present invention provides a method for identifying a compound
that modulates a lung cancer-associated polypeptide, the method comprising the steps of: (i)
contacting the compound with a lung cancer-associated polypeptide, the polypeptide encoded
by a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a
sequence as shown in Tables 1A-16; and (ii) determining the functional effect of the
compound upon the polypeptide.

In one embodiment, the functional effect is a physical effect, an enzymatic effect, or a
chemical effect. In one embodiment, the polypeptide is expressed in a eukaryotic host cell or
cell membrane. In another embodiment, the polypeptide is recombinant. In one
embodiment, the functional effect is determined by measuring ligand binding to the
polypeptide.

In another aspect, the present invention provides a method of inhibiting proliferation
or another critical process of a lung cancer-associated cell to treat lung cancer in a patient, the
method comprising the step of administering to the subject a therapeutically effective amount
of a compound identified as described herein. In one embodiment, the compound is an
antibody.

In another aspect, the present invention provides a drug screening assay comprising
the steps of: (i) administering a test compound to a mammal having lung cancer or a cell
isolated therefrom; (ii) comparing the level of gene expression of a polynucleotide that
selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables
1A-16 in a treated cell or mammal with the level of gene expression of the polynucleotide in
a control cell or mammal, wherein a test compound that modulates the level of expression of
the polynucleotide is a candidate for the treatment of lung cancer.

In one embodiment, the control is a mammal with lung cancer or a cell therefrom that
has not been treated with the test compound. In another embodiment, the control is a normal
cell or mammal, or a cell or mammal with a non-malignant lung disease.

In another aspect, the present invention provides a method for treating a mammal
having lung cancer comprising administering a compound identified by the assay described
herein.

In another aspect, the present invention provides a pharmaceutical composition for
treating a mammal having lung cancer, the composition comprising a compound identified by
the assay described herein and a physiologically acceptable excipient.

DETAILED DESCRIPTION OF THE INVENTION

In accordance with the objects outlined above, the present invention provides novel
methods for diagnosis and treatment of lung disease or cancer, as well as methods for
screening for compositions which modulate lung cancer. "Treatment, monitoring, detection
or modulation of lung disease or cancer" includes treatment, monitoring, detection, or
modulation of lung disease in those patients who have lung disease (whether malignant or
non-malignant, e.g., emphysema, bronchitis, or fibrosis) as well as patients with lung cancers
in which gene expression from a gene in Tables 1A-16 is increased or decreased, indicating
that the subject is more likely to have disease. In particular, while these targets are identified
primarily from lung cancer samples, these same targets are likely to be similarly found in
analyses of other medical conditions. These other conditions may result from similar
pathological processes which affect similar tissues, e.g., lung cancer, small cell lung
carcinoma (oat cell carcinoma), non-small cell carcinomas (e.g., squamous cell carcinoma,
adenocarcinoma, large cell lung carcinoma, carcinoid, granulomatous), fibrosis (idiopathic
pulmonary fibrosis (IPF), hypersensitivity pneumonitis (HP), interstitial pneumonitis,
nonspecific idiopathic pneumonitis (NSIP)), chronic obstructive pulmonary disease (COPD,
e.g., emphysema, chronic bronchitis), asthma, bronchiectasis, and esophageal cancer. See,
e.g., the NCI webpage and USSN 60/347,349 and USSN 60/xxx,xxx (docket LFBR-001-1P,
filed March 29, 2002), each of which is incorporated herein by reference. The treatment may
be of lung cancer or a related condition itself, or treatment of metastasis.

In particular, identification of markers selectively expressed on these cancers allows
for use of that expression in diagnostic, prognostic, or therapeutic methods. As such, the
invention defines various compositions, e.g., nucleic acids, polypeptides, antibodies, and
small molecule agonists/antagonists, which will be useful to selectively identify those
markers. For example, therapeutic methods may take the form of protein therapeutics which
use the marker expression for selective localization or modulation of function (for those
markers which have a causative disease effect), for vaccines, identification of binding
partners, or antagonism, e.g., using antisense or RNAi. The markers may be useful for
molecular characterization of subsets of lung diseases, which subsets may actually require
very different treatments. Moreover, the markers may also be important in diseases related to
the specific cancers, e.g., which affect similar tissues in non-malignant diseases, or have
similar mechanisms of induction/maintenance. Metastatic processes or characteristics may
also be targeted. Diagnostic and prognostic uses are made available, e.g., to subset related
but distinct diseases, or to determine treatment strategy. The detection methods may be based
upon nucleic acid, e.g., PCR or hybridization techniques, or protein, e.g., ELISA, imaging,
IHC, etc. The diagnosis may be qualitative or quantitative, and may detect increases or
decreases in expression levels.

Tables 1A-16 provide unigene cluster identification numbers for the nucleotide
sequence of genes that exhibit increased or decreased expression in lung cancer samples. The
tables also provide an exemplar accession number that provides a nucleotide sequence that is
part of the unigene cluster. In Table 1A, genes marked as "target 1" or "target 2" are
particularly useful as therapeutic targets. Genes marked as "target 3" are particularly useful
as diagnostic markers. Genes marked as "chron" are upregulated in chronically diseased lung
(e.g., emphysema, bronchitis, fibrosis) relative to lung tumors and normal tissue. In certain
analyses, the ratio for the "chron" category was determined using the 70th percentile of
chronically diseased lung samples divided by the 90th percentile of normal lung samples.
The ratio for the targets was determined using the 70th percentile of lung tumor samples
divided by the 90th percentile of normal lung samples.
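
As a rough illustration of the ratio described above, the following Python sketch divides the 70th percentile of one group of expression values by the 90th percentile of the normal lung group. The function names, the linear-interpolation percentile rule, and the sample values are assumptions made for illustration only; the text does not state how percentiles were interpolated or how expression values were normalized.

```python
def percentile(values, pct):
    """Percentile (pct in 0..100) of a non-empty list, by linear interpolation.

    The interpolation rule is an assumption; the source does not specify one.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must be non-empty")
    rank = (len(ordered) - 1) * pct / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    frac = rank - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


def expression_ratio(disease, normal, disease_pct=70, normal_pct=90):
    """Ratio of the disease-group percentile to the normal-group percentile."""
    return percentile(disease, disease_pct) / percentile(normal, normal_pct)


# Hypothetical expression values for one gene across samples.
tumor_samples = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
normal_samples = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
ratio = expression_ratio(tumor_samples, normal_samples)
```

Comparing a moderately high percentile of the disease group against a very high percentile of the normal group is a conservative choice: a gene scores a large ratio only if most disease samples exceed nearly all normal samples.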


Definitions 

The term "lung cancer protein" or "lung cancer polynucleotide" or "lung cancer-associated
transcript" refers to nucleic acid and polypeptide polymorphic variants, alleles,
mutants, and interspecies homologs that: (1) have a nucleotide sequence that has greater than
about 60% nucleotide sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% or greater nucleotide sequence identity,
preferably over a region of at least about 25, 50, 100, 200, 500, 1000, or
more nucleotides, to a nucleotide sequence of or associated with a unigene cluster of Tables
1A-16; (2) bind to antibodies, e.g., polyclonal antibodies, raised against an immunogen
comprising an amino acid sequence encoded by a nucleotide sequence of or associated with a
unigene cluster of Tables 1A-16, and conservatively modified variants thereof; (3)
specifically hybridize under stringent hybridization conditions to a nucleic acid sequence, or
the complement thereof, of Tables 1A-16 and conservatively modified variants thereof; or (4)
have an amino acid sequence that has greater than about 60% amino acid sequence identity,
65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%,
or 99% or greater amino acid sequence identity, preferably over a region of at
least about 25, 50, 100, 200, 500, 1000, or more amino acids, to an amino acid sequence
encoded by a nucleotide sequence of or associated with a unigene cluster of Tables 1A-16. A
polynucleotide or polypeptide sequence is typically from a mammal including, but not
limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or
other mammal. A "lung cancer polypeptide" and a "lung cancer polynucleotide" include
both naturally occurring and recombinant forms.

A "full length" lung cancer protein or nucleic acid refers to a lung cancer polypeptide
or polynucleotide sequence, or a variant thereof, that contains the elements normally
contained in one or more naturally occurring, wild type lung cancer polynucleotide or
polypeptide sequences. The "full length" may be prior to, or after, various stages of
post-translational processing or splicing, including alternative splicing.

"Biological sample" as used herein is a sample of biological tissue or fluid that
contains nucleic acids or polypeptides, e.g., of a lung cancer protein, polynucleotide, or
transcript. Such samples include, but are not limited to, tissue isolated from primates, e.g.,
humans, or rodents, e.g., mice, and rats. Biological samples may also include sections of
tissues such as biopsy and autopsy samples, frozen sections taken for histologic purposes,
archival materials, blood, plasma, serum, sputum, stool, tears, mucus, hair, skin, etc.
Biological samples also include explants and primary and/or transformed cell cultures
derived from patient tissues. A biological sample is typically obtained from a eukaryotic
organism, most preferably a mammal such as a primate, e.g., chimpanzee or human; cow;
dog; cat; a rodent, e.g., guinea pig, rat, mouse; rabbit; or other mammal; or a bird; reptile;
fish. Livestock and domestic animals are of interest.

"Providing a biological sample" means to obtain a biological sample for use in
methods described in this invention. Most often, this will be done by removing a sample of
cells from an animal, but can also be accomplished by using previously isolated cells (e.g.,
isolated by another person, at another time, and/or for another purpose), or by performing the
methods of the invention in vivo. Archival tissues or materials, having treatment or outcome
history, will be particularly useful.

The terms "identical" or percent "identity," in the context of two or more nucleic
acids or polypeptide sequences, refer to two or more sequences or subsequences that are the
same or have a specified percentage of amino acid residues or nucleotides that are the same
(e.g., about 60% identity, preferably 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and
aligned for maximum correspondence over a comparison window or designated region) as
measured using, e.g., the BLAST or BLAST 2.0 sequence comparison algorithms with default
parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI
web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to
be "substantially identical." This definition also refers to, or may be applied to, the
complement of a test sequence. The definition also includes sequences that have deletions
and/or insertions, substitutions, and naturally occurring, e.g., polymorphic or allelic variants,
and man-made variants. As described below, the preferred algorithms can account for gaps
and the like. Preferably, identity exists over a region that is at least about 25 amino acids or
nucleotides in length, or more preferably over a region that is 50-100 amino acids or
nucleotides in length.
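
To make the notion of percent identity concrete, the following Python sketch counts identical positions in two sequences that have already been aligned, with `-` marking a gap. This is only an illustration under stated assumptions: the function name is hypothetical, the alignment is taken as given, and gap columns are counted in the denominator, which is one common convention among several. It is not the BLAST calculation referred to in the text.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumes the inputs are two rows of one pairwise alignment (equal length).
    Columns where both rows are gaps are ignored; a gap opposite a residue
    counts as a non-identical column (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical aligned fragments: 4 identical columns out of 6.
identity = percent_identity("ACGT-A", "ACCTGA")
meets_80_percent = identity >= 80.0
```

Because the result depends on the region compared and on how gaps are counted, any stated threshold (e.g., 80% or 95%) is meaningful only together with the alignment method and region over which it is measured.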

For sequence comparison, typically one sequence acts as a reference sequence, to
which test sequences are compared. When using a sequence comparison algorithm, test and
reference sequences are entered into a computer, subsequence coordinates are designated, if
necessary, and sequence algorithm program parameters are designated. Preferably, default
program parameters can be used, or alternative parameters can be designated. The sequence
comparison algorithm then calculates the percent sequence identities for the test sequences
relative to the reference sequence, based on the program parameters.

A "comparison window", as used herein, includes reference to a segment of
contiguous positions selected from the group consisting typically of from 20 to 600, usually
about 50 to about 200, more usually about 100 to about 150 in which a sequence may be
compared to a reference sequence of the same number of contiguous positions after the two
sequences are optimally aligned. Methods of alignment of sequences for comparison are
well-known in the art. Optimal alignment of sequences for comparison can be conducted,
e.g., by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math.
2:482, by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol.
48:443, by the search for similarity method of Pearson and Lipman (1988) Proc. Nat'l. Acad.
Sci. USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT,
FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer
Group, 575 Science Dr., Madison, WI), or by manual alignment and visual inspection (see,
e.g., Ausubel, et al. (eds. 1995 and supplements) Current Protocols in Molecular Biology).
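
The local homology approach of Smith and Waterman named above can be sketched in a few lines of Python. This is a simplified illustration, not the GAP or BESTFIT implementation: it returns only the best local alignment score, it uses a linear gap penalty rather than affine gaps, and the function name and the default scores (match 5, mismatch -4, gap -6) are assumptions chosen for the example.

```python
def smith_waterman_score(a, b, match=5, mismatch=-4, gap=-6):
    """Best local alignment score between sequences a and b.

    Dynamic programming over a score matrix in which every cell is floored
    at zero, so an alignment may start and end anywhere in either sequence.
    Only two matrix rows are kept in memory at a time.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Best of: start fresh, extend diagonally, or open a gap in a or b.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical sequences: the second lacks one T, so the best local
# alignment spans eight matches and one gap.
score = smith_waterman_score("ACGTTACGT", "ACGTACGT")
```

Flooring each cell at zero is what makes the alignment local; a global method such as Needleman and Wunsch omits the floor and scores the sequences end to end.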

Preferred examples of algorithms that are suitable for determining percent sequence
identity and sequence similarity include the BLAST and BLAST 2.0 algorithms, which are
described in Altschul, et al. (1997) Nuc. Acids Res. 25:3389-3402 and Altschul, et al. (1990)
J. Mol. Biol. 215:403-410. BLAST and BLAST 2.0 are used, with the parameters described
herein, to determine percent sequence identity for the nucleic acids and proteins of the
invention. Software for performing BLAST analyses is publicly available through the
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short
words of length W in the query sequence, which either match or satisfy some positive-valued
threshold score T when aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul, et al., supra). These initial
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing
them. The word hits are extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are calculated using, e.g.,
for nucleotide sequences, the parameters M (reward score for a pair of matching residues;
always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid
sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word
hits in each direction is halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score goes to zero or below,
due to the accumulation of one or more negative-scoring residue alignments; or the end of
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences)
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a
comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915), alignments (B) of
50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
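
The seed-and-extend idea described above can be illustrated with a toy Python sketch for nucleotide sequences. It is not the BLAST program: seeds are exact word matches only (no neighborhood threshold T), extension is ungapped, only the X-drop and end-of-sequence stopping conditions are implemented, and no statistics are computed. The function names and the short word length `w=4` are assumptions chosen so that the example works on very short sequences; M=5 and N=-4 follow the BLASTN defaults quoted in the text.

```python
def find_seeds(query, subject, w):
    """Yield (query_pos, subject_pos) for every exact shared word of length w."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            yield i, j


def extend_one_way(query, subject, qi, si, step, m, n, x):
    """Ungapped extension from (qi, si) in direction step (+1 or -1).

    Returns (best score gain, number of positions kept). Stops when the
    running score drops x or more below its best value, or a sequence ends.
    """
    score = best = 0
    length = best_len = 0
    while 0 <= qi < len(query) and 0 <= si < len(subject):
        score += m if query[qi] == subject[si] else n
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score >= x:
            break
        qi += step
        si += step
    return best, best_len


def best_hsp(query, subject, w=4, m=5, n=-4, x=10):
    """Return (score, query_start, subject_start, length) of the best
    ungapped segment pair found from any seed, or None if no seed exists."""
    best = None
    for i, j in find_seeds(query, subject, w):
        right, rlen = extend_one_way(query, subject, i + w, j + w, 1, m, n, x)
        left, llen = extend_one_way(query, subject, i - 1, j - 1, -1, m, n, x)
        hit = (w * m + left + right, i - llen, j - llen, w + llen + rlen)
        if best is None or hit > best:
            best = hit
    return best


# Hypothetical sequences sharing the eight-base segment ACGTACGT.
hsp = best_hsp("GGGACGTACGTTTT", "CCCACGTACGTAAA")
```

A larger word length W yields fewer seeds and a faster but less sensitive search, and a larger X lets extensions pass through short mismatched stretches, which is the trade-off between sensitivity and speed noted above.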

The BLAST algorithm also performs a statistical analysis of the similarity between
two sequences (see, e.g., Karlin and Altschul (1993) Proc. Nat'l. Acad. Sci. USA 90:5873-5787).
One measure of similarity provided by the BLAST algorithm is the smallest sum
probability (P(N)), which provides an indication of the probability by which a match between
two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid
is considered similar to a reference sequence if the smallest sum probability in a comparison
of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably
less than about 0.01, and most preferably less than about 0.001. Log values may be large
negative numbers, e.g., 5, 10, 20, 30, 40, 40, 70, 90, 110, 150, 170, etc.

An indication that two nucleic acid sequences are substantially identical is that the
polypeptide encoded by the first nucleic acid is immunologically cross reactive with the
antibodies raised against the polypeptide encoded by the second nucleic acid. Thus, a
polypeptide is typically substantially identical to a second polypeptide, e.g., where the two
peptides differ only by conservative substitutions. Another indication that two nucleic acid
sequences are substantially identical is that the two molecules or their complements hybridize
to each other under stringent conditions. Yet another indication that two nucleic acid
sequences are substantially identical is that the same primers can be used to amplify the
sequences.

A "host cell" is a naturally occurring cell or a transformed cell that contains an
expression vector and supports the replication or expression of the expression vector. Host
cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be
prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or
mammalian cells such as CHO, HeLa, and the like (see, e.g., the American Type Culture
Collection catalog or web site, www.atcc.org).

The terms "isolated," "purified," or "biologically pure" refer to material that is
substantially or essentially free from components that normally accompany it as found in its
native state. Purity and homogeneity are typically determined using analytical chemistry
techniques such as polyacrylamide gel electrophoresis or high performance liquid
chromatography. A protein or nucleic acid that is the predominant species present in a
preparation is substantially purified. In particular, an isolated nucleic acid is separated from
some open reading frames that naturally flank the gene and encode proteins other than the
protein encoded by the gene. The term "purified" in some embodiments denotes that a nucleic
acid or protein gives rise to essentially one band in an electrophoretic gel. Preferably, it means
that the nucleic acid or protein is at least about 85% pure, more preferably at least 95% pure,
and most preferably at least 99% pure. "Purify" or "purification" in other embodiments
means removing at least one contaminant or component from the composition to be purified.
In this sense, purification does not require that the purified compound be homogeneous, e.g.,
100% pure.

The terms "polypeptide," "peptide" and "protein" are used interchangeably herein to
refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which
one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally
occurring amino acid, as well as to naturally occurring amino acid polymers, those containing
modified residues, and non-naturally occurring amino acid polymers.

The term "amino acid" refers to naturally occurring and synthetic amino acids, as well
as amino acid analogs and amino acid mimetics that function similarly to the naturally
occurring amino acids. Naturally occurring amino acids are those encoded by the genetic
code, as well as those amino acids that are later modified, e.g., hydroxyproline,
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refer to compounds that have
the same basic chemical structure as a naturally occurring amino acid, e.g., an α carbon that is
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs may have
modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic
chemical structure as a naturally occurring amino acid. Amino acid mimetics refer to
chemical compounds that have a structure that is different from the general chemical
structure of an amino acid, but that function similarly to another amino acid.

Amino acids may be referred to herein by either their commonly known three letter
symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly
accepted single-letter codes.

"Conservatively modified variants" applies to both amino acid and nucleic acid
sequences. With respect to particular nucleic acid sequences, conservatively modified
variants refers to those nucleic acids which encode identical or essentially identical amino
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to
essentially identical or associated, e.g., naturally contiguous, sequences. Because of the
degeneracy of the genetic code, a large number of functionally identical nucleic acids encode
most proteins. For instance, the codons GCA, GCC, GCG, and GCU each encode the amino
acid alanine. Thus, at each position where an alanine is specified by a codon, the codon can
be altered to another of the corresponding codons described without altering the encoded
polypeptide. Such nucleic acid variations are "silent variations," which are one species of
conservatively modified variations. Every nucleic acid sequence herein which encodes a
polypeptide also describes silent variations of the nucleic acid. In certain contexts each
codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and
TGG, which is ordinarily the only codon for tryptophan) can be modified to yield a
functionally similar molecule. Accordingly, a silent variation of a nucleic acid which
encodes a polypeptide is implicit in a described sequence with respect to the expression
product, but not necessarily with respect to actual probe sequences.
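By way of illustration only, the notion of silent variations can be sketched by enumerating every coding sequence that specifies the same short peptide. The codon table below is deliberately partial (alanine, methionine, and tryptophan only, written in the RNA alphabet), and the names `CODONS` and `silent_variants` are hypothetical helpers introduced for this sketch.

```python
from itertools import product

# Partial codon table in the RNA alphabet; only the residues discussed above.
CODONS = {
    "A": ["GCA", "GCC", "GCG", "GCU"],  # alanine: four synonymous codons
    "M": ["AUG"],                       # methionine: ordinarily a single codon
    "W": ["UGG"],                       # tryptophan: ordinarily a single codon
}

def silent_variants(peptide: str) -> list[str]:
    """Return every coding sequence that encodes the given peptide.

    Raises KeyError for residues missing from the partial table.
    """
    per_residue = [CODONS[residue] for residue in peptide]
    return ["".join(codons) for codons in product(*per_residue)]
```

For the peptide Met-Ala-Trp, only the alanine position can vary, so the function returns four sequences; for Ala-Ala it returns sixteen.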

As to amino acid sequences, one of skill will recognize that individual substitutions,
deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which
alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded
sequence is a "conservatively modified variant" where the alteration results in the substitution
of an amino acid with a chemically similar amino acid. Conservative substitution tables
providing functionally similar amino acids are well known in the art. Such conservatively
modified variants are in addition to and do not exclude polymorphic variants, interspecies
homologs, and alleles of the invention. Typically, conservative substitutions for one
another include: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine
(N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine
(M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S),
Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)).
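By way of illustration only, the eight substitution groups listed above can be encoded as sets and used to test whether two equal-length peptides differ only by conservative substitutions. The names `GROUPS`, `is_conservative`, and `differ_only_conservatively` are hypothetical; note that methionine belongs to two groups (5 and 8), so group membership is not transitive.

```python
# The eight groups recited in the text, using one-letter amino acid symbols.
GROUPS = [
    set("AG"),    # 1) alanine, glycine
    set("DE"),    # 2) aspartic acid, glutamic acid
    set("NQ"),    # 3) asparagine, glutamine
    set("RK"),    # 4) arginine, lysine
    set("ILMV"),  # 5) isoleucine, leucine, methionine, valine
    set("FYW"),   # 6) phenylalanine, tyrosine, tryptophan
    set("ST"),    # 7) serine, threonine
    set("CM"),    # 8) cysteine, methionine
]

def is_conservative(a: str, b: str) -> bool:
    """True if the residues are identical or share at least one group."""
    return a == b or any(a in group and b in group for group in GROUPS)

def differ_only_conservatively(p: str, q: str) -> bool:
    """True if two equal-length peptides differ only by conservative substitutions."""
    if len(p) != len(q):
        return False
    return all(is_conservative(a, b) for a, b in zip(p, q))
```

Because methionine appears in both group 5 and group 8, M/C and M/L are each conservative while C/L is not.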

Macromolecular structures such as polypeptide structures can be described in terms of
various levels of organization. For a general discussion of this organization, see, e.g.,
Alberts, et al. (1994) Molecular Biology of the Cell (3d ed.) and Cantor and Schimmel (1980)
Biophysical Chemistry Part I: The Conformation of Biological Macromolecules. "Primary
structure" refers to the amino acid sequence of a particular peptide. "Secondary structure"
refers to locally ordered, three dimensional structures within a polypeptide. These structures
are commonly known as domains. Domains are portions of a polypeptide that often form a
compact unit of the polypeptide and are typically 25 to approximately 500 amino acids long.
Typical domains are made up of sections of lesser organization such as stretches of β-sheet
and α-helices. "Tertiary structure" refers to the complete three dimensional structure of a
polypeptide monomer. "Quaternary structure" refers to the three dimensional structure
formed, usually by the noncovalent association of independent tertiary units. Anisotropic
terms are also known as energy terms.

"Nucleic acid" or "oligonucleotide" or "polynucleotide" or grammatical equivalents
used herein means at least two nucleotides covalently linked together. Oligonucleotides are
typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more nucleotides in length, up
to about 100 nucleotides in length. Nucleic acids and polynucleotides are polymers of any
length, including longer lengths, e.g., 200, 300, 500, 1000, 2000, 3000, 5000, 7000, 10,000,
etc. A nucleic acid of the present invention will generally contain phosphodiester bonds,
although in some cases, nucleic acid analogs are included that may have at least one different
linkage, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or
O-methylphosphoroamidite linkages (see Eckstein (1992) Oligonucleotides and Analogues: A
Practical Approach Oxford University Press); and peptide nucleic acid backbones and
linkages. Other analog nucleic acids include those with positive backbones, non-ionic
backbones, and non-ribose backbones, including those described in U.S. Patent Nos.
5,235,033 and 5,034,506, and Chapters 6 and 7, in Sanghvi and Cook, eds. Carbohydrate
Modifications in Antisense Research, ACS Symposium Series 580. Nucleic acids containing
one or more carbocyclic sugars are also included within one definition of nucleic acids.
Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to
increase the stability and half-life of such molecules in physiological environments or as
probes on a biochip. Mixtures of naturally occurring nucleic acids and analogs can be made;
alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring
nucleic acids and analogs may be made.

Particularly preferred are peptide nucleic acids (PNA), which include peptide nucleic
acid analogs. These backbones are substantially non-ionic under neutral conditions, in
contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids.
This results in two advantages. First, the PNA backbone exhibits improved hybridization
kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus
perfectly matched basepairs. DNA and RNA typically exhibit a 2-4° C drop in Tm for an
internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9° C.
Similarly, due to their non-ionic nature, hybridization of the bases attached to these
backbones is relatively insensitive to salt concentration. In addition, PNAs are not degraded
by cellular enzymes, and thus can be more stable.

The nucleic acids may be single stranded or double stranded, as specified, or contain
portions of both double stranded or single stranded sequence. As will be appreciated by those
in the art, the depiction of a single strand also defines the sequence of the complementary
strand; thus the sequences described herein also provide the complement of the sequence.
The nucleic acid may be DNA, both genomic and cDNA, RNA, or a hybrid, where the
nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations
of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine,
hypoxanthine, isocytosine, isoguanine, etc. "Transcript" typically refers to a naturally
occurring RNA, e.g., a pre-mRNA, hnRNA, or mRNA. As used herein, the term
"nucleoside" includes nucleotides and nucleoside and nucleotide analogs, and modified
nucleosides such as amino modified nucleosides. In addition, "nucleoside" includes
non-naturally occurring analog structures. Thus, e.g., the individual units of a peptide nucleic
acid, each containing a base, are referred to herein as a nucleoside.
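By way of illustration only, the statement that a depicted single strand also defines its complementary strand can be sketched for a plain DNA sequence as follows. The helper name `reverse_complement` is hypothetical, and the sketch assumes the unmodified bases A, C, G, and T only; analog bases such as inosine are left unchanged rather than mapped.

```python
# Translation table for the four standard DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'.

    Characters outside A, C, G, T are passed through unchanged.
    """
    return seq.translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which reflects the fact that each strand fully specifies the other.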

A "label" or a "detectable moiety" is a composition detectable by spectroscopic,
photochemical, biochemical, immunochemical, physiological, chemical, or other physical
means. For example, useful labels include ³²P, fluorescent dyes, electron-dense reagents,
enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins
or other entities which can be made detectable, e.g., by incorporating a radiolabel into the
peptide or used to detect antibodies specifically reactive with the peptide. The labels may be
incorporated into the cancer nucleic acids, proteins, and antibodies. Many methods known in
the art for conjugating the antibody to the label may be employed, including those methods
described by Hunter, et al. (1962) Nature 144:945; David, et al. (1974) Biochemistry
13:1014-1021; Pain, et al. (1981) J. Immunol. Meth. 40:219-230; and Nygren (1982) J.
Histochem. and Cytochem. 30:407-412.

An "effector" or "effector moiety" or "effector component" is a molecule that is
bound (or linked, or conjugated), either covalently, through a linker or a chemical bond, or
noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to an antibody.
The "effector" can be a variety of molecules including, e.g., detection moieties including
radioactive compounds, fluorescent compounds, an enzyme or substrate, tags such as epitope
tags, a toxin; activatable moieties, a chemotherapeutic agent; a lipase; an antibiotic; or a
radioisotope emitting "hard," e.g., beta, radiation.

A "labeled nucleic acid probe or oligonucleotide" is one that is bound, either
covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der
Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be
detected by detecting the presence of the label bound to the probe. Alternatively, methods
using high affinity interactions may achieve the same results where one of a pair of binding
partners binds to the other, e.g., biotin, streptavidin.

As used herein a "nucleic acid probe or oligonucleotide" is a nucleic acid capable of
binding to a target nucleic acid of complementary sequence through one or more types of
chemical bonds, usually through complementary base pairing, e.g., through hydrogen bond
formation. As used herein, a probe may include natural (i.e., A, G, C, or T) or modified bases
(7-deazaguanosine, inosine, etc.). In addition, the bases in a probe may be joined by a
linkage other than a phosphodiester bond, preferably one that does not functionally interfere
with hybridization. Thus, e.g., probes may be peptide nucleic acids in which the constituent
bases are joined by peptide bonds rather than phosphodiester linkages. Probes may bind
target sequences lacking complete complementarity with the probe sequence depending upon
the stringency of the hybridization conditions. The probes are preferably directly labeled,
e.g., with isotopes, chromophores, lumiphores, chromogens, or indirectly labeled, e.g., with
biotin to which a streptavidin complex may later bind. By assaying for the presence or
absence of the probe, one can detect the presence or absence of the select sequence or
subsequence. Diagnosis or prognosis may be based at the genomic level, or at the level of
RNA or protein expression.

The term "recombinant" when used with reference, e.g., to a cell, or nucleic acid,
protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been modified by
the introduction of a heterologous nucleic acid or protein or the alteration of a native nucleic
acid or protein, or that the cell is derived from a cell so modified. Thus, e.g., recombinant
cells express genes that are not found within the native (non-recombinant) form of the cell or
express native genes that are otherwise abnormally expressed, under expressed or not
expressed at all. By the term "recombinant nucleic acid" herein is meant nucleic acid,
originally formed in vitro, in general, by the manipulation of nucleic acid, e.g., using
polymerases and endonucleases, in a form not normally found in nature. In this manner,
operable linkage of different sequences is achieved. Thus an isolated nucleic acid, in a linear
form, or an expression vector formed in vitro by ligating DNA molecules that are not
normally joined, are both considered recombinant for the purposes of this invention. It is
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or
organism, it will replicate non-recombinantly, i.e., using the in vivo cellular machinery of the
host cell rather than in vitro manipulations; however, such nucleic acids, once produced
recombinantly, although subsequently replicated non-recombinantly, are still considered
recombinant for the purposes of the invention. Similarly, a "recombinant protein" is a protein
made using recombinant techniques, i.e., through the expression of a recombinant nucleic
acid as depicted above.

The term "heterologous" when used with reference to portions of a nucleic acid
indicates that the nucleic acid comprises two or more subsequences that are not normally
found in the same relationship to each other in nature. For instance, the nucleic acid is
typically recombinantly produced, having two or more sequences, e.g., from unrelated genes
arranged to make a new functional nucleic acid, e.g., a promoter from one source and a
coding region from another source. Similarly, a heterologous protein will often refer to two
or more subsequences that are not found in the same relationship to each other in nature (e.g.,
a fusion protein).

A "promoter" is typically an array of nucleic acid control sequences that direct
transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic acid
sequences near the start site of transcription, such as, in the case of a polymerase II type
promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor
elements, which can be located as much as several thousand base pairs from the start site of
transcription. A "constitutive" promoter is a promoter that is active under most
environmental and developmental conditions. An "inducible" promoter is a promoter that is
active under environmental or developmental regulation. The term "operably linked" refers
to a functional linkage between a nucleic acid expression control sequence (such as a
promoter, or array of transcription factor binding sites) and a second nucleic acid sequence,
e.g., wherein the expression control sequence directs transcription of the nucleic acid
corresponding to the second sequence.

An "expression vector" is a nucleic acid construct, generated recombinantly or
synthetically, with a series of specified nucleic acid elements that permit transcription of a
particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or
nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be
transcribed in operable linkage to a promoter.

The phrase "selectively (or specifically) hybridizes to" refers to the binding,
duplexing, or hybridizing of a molecule selectively to a particular nucleotide sequence under
stringent hybridization conditions when that sequence is present in a complex mixture (e.g.,
total cellular or library DNA or RNA).

The phrase "stringent hybridization conditions" refers to conditions under which a
probe will hybridize to its target subsequence, typically in a complex mixture of nucleic
acids, but to essentially no other sequences. Stringent conditions are sequence-dependent and
will be different in different circumstances. Longer sequences hybridize specifically at
higher temperatures. An extensive guide to the hybridization of nucleic acids is found in
"Overview of principles of hybridization and the strategy of nucleic acid assays" in Tijssen
(1993) Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic
Probes (vol. 24) Elsevier. Generally, stringent conditions are selected to be about 5-10° C
lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength
and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic concentration)
at which 50% of the probes complementary to the target hybridize to the target sequence at
equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are
occupied at equilibrium). Stringent conditions will be those in which the salt concentration is
less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or
other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C for short probes (e.g.,
10 to 50 nucleotides) and at least about 60° C for long probes (e.g., greater than 50
nucleotides). Stringent conditions may also be achieved with the addition of destabilizing
agents such as formamide. For selective or specific hybridization, a positive signal is
typically at least two times background, preferably 10 times background hybridization.
Exemplary stringent hybridization conditions are often: 50% formamide, 5x SSC, and 1%
SDS, incubating at 42° C, or, 5x SSC, 1% SDS, incubating at 65° C, with wash in 0.2x SSC,
and 0.1% SDS at 65° C. For PCR, a temperature of about 36° C is typical for low stringency
amplification, although annealing temperatures may vary between about 32° C and 48° C
depending on primer length. For high stringency PCR amplification, a temperature of about
62° C is typical, although high stringency annealing temperatures can range from about 50° C
to about 65° C, depending on the primer length and specificity. Typical cycle conditions for
both high and low stringency amplifications include a denaturation phase of 90° C - 95° C for
0.5 - 2 min., an annealing phase lasting 0.5 - 2 min., and an extension phase of about 72° C
for 1 - 2 min. Protocols and guidelines for low and high stringency amplification reactions
are provided, e.g., in Innis, et al. (1990) PCR Protocols: A Guide to Methods and
Applications.
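By way of illustration only, the numerical criteria recited above for stringent conditions can be collected into a few small checks. The function names below are hypothetical, the "about" qualifiers in the text are treated as exact cut-offs, and the sketch does not attempt to calculate Tm itself; a measured or separately estimated Tm is assumed as input.

```python
def hybridization_temp_range(tm_c: float) -> tuple[float, float]:
    """Temperature window about 5-10 degrees C below the supplied Tm."""
    return (tm_c - 10.0, tm_c - 5.0)

def meets_stringent_minimums(sodium_molar: float, ph: float,
                             temp_c: float, probe_len: int) -> bool:
    """Check salt, pH, and minimum temperature against the recited values.

    Assumption: probes of up to 50 nucleotides count as "short" probes
    (at least 30 C); longer probes require at least 60 C.
    """
    min_temp = 30.0 if probe_len <= 50 else 60.0
    return sodium_molar < 1.0 and 7.0 <= ph <= 8.3 and temp_c >= min_temp

def is_positive_signal(signal: float, background: float,
                       fold: float = 2.0) -> bool:
    """Positive signal: at least `fold` times background (2x, preferably 10x)."""
    return signal >= fold * background
```

For a probe with a Tm of 70° C, the first function returns a window of 60 to 65° C; passing `fold=10.0` to the last function applies the preferred ten-fold criterion.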

Nucleic acids that do not hybridize to each other under stringent conditions are still
substantially identical if the polypeptides which they encode are substantially identical. This
occurs, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy
permitted by the genetic code. In such cases, the nucleic acids typically hybridize under
moderately stringent hybridization conditions. Exemplary "moderately stringent
hybridization conditions" include a hybridization in a buffer of 40% formamide, 1 M NaCl,
1% SDS at 37° C, and a wash in 1X SSC at 45° C. A positive hybridization is at least twice
background. Alternative hybridization and wash conditions can be utilized to provide
conditions of similar stringency. Additional guidelines for determining hybridization
parameters are provided in numerous references, e.g., Ausubel, et al. (ed.) Current Protocols in
Molecular Biology Lippincott.

The phrase "functional effects" in the context of assays for testing compounds that
modulate activity of a lung cancer protein includes the determination of a parameter that is
indirectly or directly under the influence of the lung cancer protein or nucleic acid, e.g., a
physiological, enzymatic, functional, physical, or chemical effect, such as the ability to
decrease lung cancer. It includes ligand binding activity, cell viability, cell growth on soft
agar; anchorage dependence; contact inhibition and density limitation of growth; cellular
proliferation; cellular transformation; growth factor or serum dependence; tumor specific
marker levels; invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and
protein expression in cells undergoing metastasis, and other characteristics of lung cancer
cells. "Functional effects" include in vitro, in vivo, and ex vivo activities.

By "determining the functional effect" is meant assaying for a compound that
increases or decreases a parameter that is indirectly or directly under the influence of a lung
cancer protein sequence, e.g., physiological, functional, enzymatic, physical, or chemical
effects. Such functional effects can be measured by many means known to those skilled in
the art, e.g., changes in spectroscopic characteristics (e.g., fluorescence, absorbance,
refractive index), hydrodynamic (e.g., shape), chromatographic, or solubility properties for
the protein, measuring inducible markers or transcriptional activation of the lung cancer
protein; measuring binding activity or binding assays, e.g., binding to antibodies or other
ligands, and measuring cellular proliferation. Determination of the functional effect of a
compound on lung cancer can also be performed using lung cancer assays known to those of
skill in the art such as in vitro assays, e.g., cell growth on soft agar; anchorage
dependence; contact inhibition and density limitation of growth; cellular proliferation;
cellular transformation; growth factor or serum dependence; tumor specific marker levels;
invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and protein
expression in cells undergoing metastasis, and other characteristics of lung cancer cells. The
functional effects can be evaluated by many means known to those skilled in the art, e.g.,
microscopy for quantitative or qualitative measures of alterations in morphological features,
measurement of changes in RNA or protein levels for lung cancer-associated sequences,
measurement of RNA stability, identification of downstream or reporter gene expression
(CAT, luciferase, β-gal, GFP, and the like), e.g., via chemiluminescence, fluorescence,
colorimetric reactions, antibody binding, inducible markers, and ligand binding assays.

"Inhibitors", "activators", and "modulators" of lung cancer polynucleotide and
polypeptide sequences are used to refer to activating, inhibitory, or modulating molecules or
compounds identified using in vitro and in vivo assays of lung cancer polynucleotide and
polypeptide sequences. Inhibitors are compounds that, e.g., bind to, partially or totally block
activity, decrease, prevent, delay activation, inactivate, desensitize, or down regulate the
activity or expression of lung cancer proteins, e.g., antagonists. Antisense or inhibitory
nucleic acids may seem to inhibit expression and subsequent function of the protein.
"Activators" are compounds that increase, open, activate, facilitate, enhance activation,
sensitize, agonize, or up regulate lung cancer protein activity. Inhibitors, activators, or
modulators also include genetically modified versions of lung cancer proteins, e.g., versions
with altered activity, as well as naturally occurring and synthetic ligands, antagonists,
agonists, antibodies, small chemical molecules and the like. Such assays for inhibitors and
activators include, e.g., expressing the lung cancer protein in vitro, in cells, or cell
membranes, applying putative modulator compounds, and then determining the functional
effects on activity, as described above. Activators and inhibitors of lung cancer can also be
identified by incubating lung cancer cells with the test compound and determining increases
or decreases in the expression of 1 or more lung cancer proteins, e.g., 1, 2, 3, 4, 5, 10, 15, 20,
25, 30, 40, 50 or more lung cancer proteins, such as lung cancer proteins encoded by the
sequences set out in Tables 1A-16.

Samples or assays comprising lung cancer proteins that are treated with a potential
activator, inhibitor, or modulator are compared to control samples without the inhibitor,
activator, or modulator to examine the extent of inhibition. Control samples (untreated with
inhibitors) are assigned a relative protein activity value of 100%. Inhibition of a polypeptide
is achieved when the activity value relative to the control is about 80%, preferably 50%, more
preferably 25-0%. Activation of a lung cancer polypeptide is achieved when the activity
value relative to the control (untreated with activators) is 110%, more preferably 150%, more
preferably 200-500% (i.e., two to five fold higher relative to the control), more preferably
1000-3000% higher.
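By way of illustration only, the comparison of a treated sample against an untreated control assigned 100% activity can be sketched as follows. The function name `classify_activity` and the labels are hypothetical; only the broadest recited cut-offs (80% for inhibition, 110% for activation) are applied, and they are treated as exact values although the text qualifies the first with "about".

```python
def classify_activity(treated: float, control: float) -> tuple[float, str]:
    """Return activity as a percentage of control and a coarse label.

    Assumption: activity at or below 80% of control counts as inhibition,
    and activity at or above 110% of control counts as activation.
    """
    if control <= 0:
        raise ValueError("control activity must be positive")
    percent = 100.0 * treated / control
    if percent <= 80.0:
        label = "inhibited"
    elif percent >= 110.0:
        label = "activated"
    else:
        label = "unchanged"
    return percent, label
```

The preferred tiers recited in the text (e.g., 50% or 25-0% for inhibition, 150% or 200-500% for activation) could be layered on the returned percentage in the same manner.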

The phrase "changes in cell growth" refers to any change in cell growth and
proliferation characteristics in vitro or in vivo, such as cell viability, formation of foci,
anchorage independence, semi-solid or soft agar growth, changes in contact inhibition and
density limitation of growth, loss of growth factor or serum requirements, changes in cell
morphology, gaining or losing immortalization, gaining or losing tumor specific markers,
ability to form or suppress tumors when injected into suitable animal hosts, and/or
immortalization of the cell. See, e.g., Freshney (1994) Culture of Animal Cells: A Manual of
Basic Technique pp. 231-241 (3d ed.).

"Tumor cell" refers to precancerous, cancerous, and normal cells in a tumor.
"Cancer cells," "transformed" cells, or "transformation" in tissue culture, refers to
spontaneous or induced phenotypic changes that do not necessarily involve the uptake of new
genetic material. Although transformation can arise from infection with a transforming virus
and incorporation of new genomic DNA, or uptake of exogenous DNA, it can also arise
spontaneously or following exposure to a carcinogen, thereby mutating an endogenous gene.
Transformation is associated with phenotypic changes, such as immortalization of cells,
aberrant growth control, nonmorphological changes, and/or malignancy (see, Freshney
(1994) Culture of Animal Cells: A Manual of Basic Technique (3d ed.)).

"Antibody" refers to a polypeptide comprising a framework region from an
immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen.
The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta,
epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region
genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as
gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG,
IgM, IgA, IgD, and IgE, respectively. Typically, the antigen-binding region of an antibody or
its functional equivalent will be most critical in specificity and affinity of binding. See Paul,
Fundamental Immunology.

An exemplary immunoglobulin (antibody) structural unit comprises a tetramer. Each
tetramer is composed of two identical pairs of polypeptide chains, each pair having one
"light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus of each
chain defines a variable region of about 100 to 110 or more amino acids primarily responsible
for antigen recognition. The terms variable light chain (VL) and variable heavy chain (VH)
refer to these light and heavy chains respectively.

Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized
fragments produced by digestion with various peptidases. Thus, e.g., pepsin digests an
antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a dimer of Fab
which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab)'2 may be
reduced under mild conditions to break the disulfide linkage in the hinge region, thereby
converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is essentially Fab
with part of the hinge region (see Paul (ed. 1999) Fundamental Immunology (4th ed.)). While
various antibody fragments are defined in terms of the digestion of an intact antibody, one of
skill will appreciate that such fragments may be synthesized de novo either chemically or by
using recombinant DNA methodology. Thus, the term antibody, as used herein, also includes
antibody fragments either produced by the modification of whole antibodies, or those
synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv) or those
identified using phage display libraries (see, e.g., McCafferty, et al. (1990) Nature 348:552-
554).



For preparation of antibodies, e.g., recombinant, monoclonal, or polyclonal
antibodies, many techniques known in the art can be used (see, e.g., Kohler and Milstein
(1975) Nature 256:495-497; Kozbor, et al. (1983) Immunology Today 4:72; Cole, et al.
(1985), pp. 77-96 in Monoclonal Antibodies and Cancer Therapy; Coligan (1991 and
supplements) Current Protocols in Immunology; Harlow and Lane (1988) Antibodies: A
Laboratory Manual; and Goding (1986) Monoclonal Antibodies: Principles and Practice (2d
ed.)). Techniques for the production of single chain antibodies (U.S. Patent 4,946,778) can
be adapted to produce antibodies to polypeptides of this invention. Also, transgenic mice, or
other organisms such as other mammals, may be used to express humanized antibodies.
Alternatively, phage display technology can be used to identify antibodies and heteromeric
Fab fragments that specifically bind to selected antigens (see, e.g., McCafferty, et al. (1990)
Nature 348:552-554; Marks, et al. (1992) Biotechnology 10:779-783).

A "chimeric antibody" is an antibody molecule in which, e.g., (a) the constant region,
or a portion thereof, is altered, replaced, or exchanged so that the antigen binding site
(variable region) is linked to a constant region of a different or altered class, effector
function, and/or species, or an entirely different molecule which confers new properties to the
chimeric antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the
variable region, or a portion thereof, is altered, replaced, or exchanged with a variable region
having a different or altered antigen specificity.



Identification of lung cancer-associated sequences 
In one aspect, the expression levels of genes are determined in different patient
samples for which diagnosis information is desired, to provide expression profiles. An
expression profile of a particular sample is essentially a "fingerprint" of the state of the
sample; while two states may have any particular gene similarly expressed, the evaluation of
a number of genes simultaneously allows the generation of a gene expression profile that is
characteristic of the state of the cell. That is, normal tissue may be distinguished from
cancerous or metastatic cancerous tissue, or metastatic cancerous tissue can be compared
with tissue from surviving cancer patients. By comparing expression profiles of tissue in
known different lung cancer states, information regarding which genes are important
(including both up- and down-regulation of genes) in each of these states is obtained.
Molecular profiling may distinguish subtypes of a currently collective disease designation,
e.g., different forms of lung cancer (chronic disease, adenocarcinoma, etc.).

The identification of sequences that are differentially expressed in lung cancer versus
non-lung cancer tissue allows the use of this information in a number of ways. For example,
a particular treatment regime may be evaluated: does a chemotherapeutic drug act to
down-regulate lung cancer, and thus tumor growth or recurrence, in a particular patient?
Alternatively, a treatment step may induce other markers which may be used as targets to
destroy tumor cells. Similarly, diagnosis and treatment outcomes may be done or confirmed
by comparing patient samples with the known expression profiles. Malignant disease may be
compared to non-malignant conditions. Metastatic tissue can also be analyzed to determine
the stage of lung cancer in the tissue, or origin of primary tumor, e.g., metastasis from a
remote primary site. Furthermore, these gene expression profiles (or individual genes) allow
screening of drug candidates with an eye to mimicking or altering a particular expression
profile; e.g., screening can be done for drugs that suppress the lung cancer expression profile.
This may be done by making biochips comprising sets of the important lung cancer genes,
which can then be used in these screens. PCR methods may be applied with selected primer
pairs, and analysis may be of RNA or of genomic sequences. These methods can also be
done on the protein basis; that is, protein expression levels of the lung cancer proteins can be
evaluated for diagnostic purposes or to screen candidate agents. In addition, the lung cancer
nucleic acid sequences can be administered for gene therapy purposes, including the
administration of antisense nucleic acids, or the lung cancer proteins (including antibodies
and other modulators thereof) administered as therapeutic drugs or as protein or DNA
vaccines.

Thus the present invention provides nucleic acid and protein sequences that are
differentially expressed in lung cancer relative to normal tissues and/or non-malignant lung
disease, or in different types of lung disease, herein termed "lung cancer sequences." As
outlined below, lung cancer sequences include those that are up-regulated (i.e., expressed at a
higher level) in lung cancer, as well as those that are down-regulated (i.e., expressed at a
lower level). In a preferred embodiment, the lung cancer sequences are from humans;
however, as will be appreciated by those in the art, lung cancer sequences from other
organisms may be useful in animal models of disease and drug evaluation; thus, other lung
cancer sequences are provided, from vertebrates, including mammals, including rodents (rats,
mice, hamsters, guinea pigs, etc.), primates, farm animals (including sheep, goats, pigs, cows,
horses, etc.) and pets (dogs, cats, etc.). Lung cancer sequences from other organisms may be
obtained using the techniques outlined below.

Lung cancer sequences can include both nucleic acid and amino acid sequences. As
will be appreciated by those in the art and is more fully outlined below, lung cancer nucleic
acid sequences are useful in a variety of applications, including diagnostic applications,
which will detect naturally occurring nucleic acids, as well as screening applications; e.g.,
biochips comprising nucleic acid probes or PCR microtiter plates with selected probes to the
lung cancer sequences can be generated.

A lung cancer sequence can be initially identified by substantial nucleic acid and/or
amino acid sequence homology to the lung cancer sequences outlined herein. Such
homology can be based upon the overall nucleic acid or amino acid sequence, and is
generally determined as outlined below, e.g., using homology programs or hybridization
conditions.

For identifying lung cancer-associated sequences, the lung cancer screen typically
includes comparing genes identified in different tissues, e.g., normal and cancerous tissues,
cancer and non-malignant conditions, non-malignant conditions and normal tissues, or tumor
tissue samples from patients who have metastatic disease vs. non-metastatic tissue. Other
suitable tissue comparisons include comparing lung cancer samples with metastatic cancer
samples from other cancers, such as breast, other gastrointestinal cancers, prostate, ovarian,
etc. Samples of non-metastatic disease tissue and tissue undergoing metastasis are applied to
biochips comprising nucleic acid probes. The samples are first microdissected, if applicable,
and treated as is known in the art for the preparation of mRNA. Suitable biochips are
commercially available, e.g., from Affymetrix, Santa Clara, CA. Gene expression profiles as
described herein are generated and the data analyzed.

In one embodiment, the genes showing changes in expression as between normal and
disease states are compared to genes expressed in other normal tissues, preferably normal
lung, but also including, and not limited to, colon, heart, brain, liver, breast, kidney, muscle,
prostate, small intestine, large intestine, spleen, bone, and/or placenta. In a preferred
embodiment, those genes identified during the lung cancer screen that are expressed in
significant amounts in other tissues (e.g., essential organs) are removed from the profile,
although in some embodiments, this is not necessary (e.g., where organs may be dispensable
at a later stage of life). That is, when screening for drugs, it is usually preferable that the
target expression be disease specific, to minimize possible side effects on other organs.

In a preferred embodiment, lung cancer sequences are those that are up-regulated in
lung cancer; that is, the expression of these genes is higher in cancerous tissue than in normal
lung or other tissue. "Up-regulation" as used herein means, when the ratio is presented as a
number greater than one, that the ratio is greater than one, preferably 1.5 or greater, more
preferably 2.0 or greater. Another embodiment is directed to sequences up-regulated in
non-malignant conditions relative to normal. Unigene cluster identification numbers and
accession numbers herein are for the GenBank sequence database and the sequences of the
accession numbers are hereby expressly incorporated by reference. GenBank is known in the
art, see, e.g., Benson, DA, et al. (1998) Nucleic Acids Research 26:1-7 and
http://www.ncbi.nlm.nih.gov/. Sequences are also available in other databases, e.g.,
European Molecular Biology Laboratory (EMBL) and DNA Data Bank of Japan (DDBJ).
In some situations, the sequences may be derived from assembly of
available sequences or be predicted from genomic DNA using exon prediction algorithms,
such as FGENESH (Salamov and Solovyev (2000) Genome Res. 10:516-522). In other
situations, sequences have been derived from cloning and sequencing of isolated nucleic
acids.

In another preferred embodiment, lung cancer sequences are those that are down-regulated
in the lung cancer; that is, the expression of these genes is lower in cancerous tissue
than in normal lung or other tissue. "Down-regulation" as used herein means, when the ratio is
presented as a number greater than one, that the ratio is greater than one, preferably 1.5 or
greater, more preferably 2.0 or greater, or, when the ratio is presented as a number less than
one, that the ratio is less than one, preferably 0.5 or less, more preferably 0.25 or less.
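As an illustration only, the ratio-based definitions of up- and down-regulation can be sketched in a few lines of Python. The function name, the argument names, and the choice to express the ratio as tumor level divided by normal level are assumptions made for this sketch; the default cutoffs of 2.0 and 0.5 are taken from the preferred values recited above, and other cutoffs (e.g., 1.5 or 0.25) could equally be supplied.

```python
def classify_regulation(tumor_level, normal_level,
                        up_threshold=2.0, down_threshold=0.5):
    """Classify a gene by the ratio of tumor to normal expression.

    Hypothetical helper: the ratio is taken as tumor / normal, so values
    above one indicate higher expression in cancerous tissue.
    """
    if normal_level <= 0 or tumor_level < 0:
        raise ValueError("expression levels must be positive")
    ratio = tumor_level / normal_level
    if ratio >= up_threshold:
        return "up-regulated"
    if ratio <= down_threshold:
        return "down-regulated"
    return "unchanged"


if __name__ == "__main__":
    # Illustrative expression values only, not data from any assay.
    for tumor, normal in [(300.0, 100.0), (40.0, 100.0), (120.0, 100.0)]:
        print(tumor, normal, classify_regulation(tumor, normal))
```

A ratio presented the other way around (normal divided by tumor) would simply swap the roles of the two cutoffs.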

Informatics 

The ability to identify genes that are over- or under-expressed in lung cancer can
additionally provide high-resolution, high-sensitivity datasets which can be used in the areas
of diagnostics, therapeutics, drug development, pharmacogenetics, protein structure,
biosensor development, and other related areas. For example, the expression profiles can be
used in diagnostic or prognostic evaluation of patients with lung cancer. Or as another
example, subcellular toxicological information can be generated to better direct drug structure
and activity correlation (see Anderson (1998) Pharmaceutical Proteomics: Targets,
Mechanism, and Function, paper presented at the IBC Proteomics conference, Coronado, CA
(June 11-12, 1998)). Subcellular toxicological information can also be utilized in a biological
sensor device to predict the likely toxicological effect of chemical exposures and likely
tolerable exposure thresholds (see U.S. Patent No. 5,811,231). Similar advantages accrue
from datasets relevant to other biomolecules and bioactive agents (e.g., nucleic acids,
saccharides, lipids, drugs, and the like).

Thus, in another embodiment, the present invention provides a database that includes
at least one set of assay data. The data contained in the database is acquired, e.g., using array
analysis either singly or in a library format. The database can be in a form in which data can
be maintained and transmitted, but is preferably an electronic database. The electronic
database of the invention can be maintained on any electronic device allowing for the storage
of and access to the database, such as a personal computer, but is preferably distributed on a
wide area network, such as the World Wide Web.

The focus of the present section on databases that include peptide sequence data is for 
clarity of illustration only. It will be apparent to those of skill in the art that similar databases 
can be assembled for assay data acquired using an assay of the invention. 

The compositions and methods for identifying and/or quantitating the relative and/or
absolute abundance of a variety of molecular and macromolecular species from a biological
sample representing lung cancer, i.e., the identification of lung cancer-associated sequences
described herein, provide an abundance of information, which can be correlated with
pathological conditions, predisposition to disease, drug testing, therapeutic monitoring,
gene-disease causal linkages, identification of correlates of immunity and physiological status,
among others. Although the data generated from the assays of the invention is suited for
manual review and analysis, in a preferred embodiment, data processing using high-speed
computers is utilized.

An array of methods for indexing and retrieving biomolecular information is known
in the art. For example, U.S. Patents 6,023,659 and 5,966,712 disclose a relational database
system for storing biomolecular sequence information in a manner that allows sequences to
be catalogued and searched according to one or more protein function hierarchies. U.S.
Patent 5,953,727 discloses a relational database having sequence records containing
information in a format that allows a collection of partial-length DNA sequences to be
catalogued and searched according to association with one or more sequencing projects for
obtaining full-length sequences from the collection of partial length sequences. U.S. Patent
5,706,498 discloses a gene database retrieval system for making a retrieval of a gene
sequence similar to a sequence data item in a gene database based on the degree of similarity
between a key sequence and a target sequence. U.S. Patent 5,538,897 discloses a method
using mass spectroscopy fragmentation patterns of peptides to identify amino acid sequences
in computer databases by comparison of predicted mass spectra with experimentally-derived
mass spectra using a closeness-of-fit measure. U.S. Patent 5,926,818 discloses a
multi-dimensional database comprising a functionality for multi-dimensional data analysis
described as on-line analytical processing (OLAP), which entails the consolidation of
projected and actual data according to more than one consolidation path or dimension. U.S.
Patent 5,295,261 reports a hybrid database structure in which the fields of each database
record are divided into two classes, navigational and informational data, with navigational
fields stored in a hierarchical topological map which can be viewed as a tree structure or as
the merger of two or more such tree structures.

See also Mount, et al. (2001) Bioinformatics; Durbin, et al. (eds., 1999) Biological
Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids; Baxevanis and
Ouellette (eds., 1998) Bioinformatics: A Practical Guide to the Analysis of Genes and
Proteins; Rashidi and Buehler (1999) Bioinformatics: Basic Applications in Biological
Science and Medicine; Setubal, et al. (eds., 1997) Introduction to Computational Molecular
Biology; Misener and Krawetz (eds., 2000) Bioinformatics: Methods and Protocols; Higgins
and Taylor (eds., 2000) Bioinformatics: Sequence, Structure, and Databanks: A Practical
Approach; Brown (2001) Bioinformatics: A Biologist's Guide to Biocomputing and the
Internet; Han and Kamber (2000) Data Mining: Concepts and Techniques; and
Waterman (1995) Introduction to Computational Biology: Maps, Sequences, and Genomes.

The present invention provides a computer database comprising a computer and
software for storing in computer-retrievable form assay data records cross-tabulated, e.g.,
with data specifying the source of the target-containing sample from which each sequence
specificity record was obtained.

In an exemplary embodiment, at least one of the sources of target-containing sample
is from a control tissue sample known to be free of pathological disorders. In a variation, at
least one of the sources is a known pathological tissue specimen, e.g., a neoplastic lesion or
another tissue specimen to be analyzed for lung cancer. In another variation, the assay
records cross-tabulate one or more of the following parameters for each target species in a
sample: (1) a unique identification code, which can include, e.g., a target molecular structure
and/or characteristic separation coordinate (e.g., electrophoretic coordinates); (2) sample
source; and (3) absolute and/or relative quantity of the target species present in the sample.
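By way of a non-limiting sketch, assay records carrying the three parameters listed above (identification code, sample source, and quantity) could be held in a small relational table. The example below uses the SQLite module from the Python standard library; the table name, column names, function names, and sample values are all hypothetical and are chosen only for illustration.

```python
import sqlite3


def build_assay_db():
    """Create an in-memory database with one table of assay records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE assay_record (
               target_id     TEXT NOT NULL,  -- (1) unique identification code
               sample_source TEXT NOT NULL,  -- (2) sample source
               quantity      REAL NOT NULL,  -- (3) absolute or relative quantity
               PRIMARY KEY (target_id, sample_source)
           )"""
    )
    return conn


def add_record(conn, target_id, sample_source, quantity):
    """Store one assay record."""
    conn.execute(
        "INSERT INTO assay_record VALUES (?, ?, ?)",
        (target_id, sample_source, quantity),
    )


def cross_tabulate(conn, target_id):
    """Return {sample_source: quantity} for one target species."""
    rows = conn.execute(
        "SELECT sample_source, quantity FROM assay_record "
        "WHERE target_id = ? ORDER BY sample_source",
        (target_id,),
    )
    return dict(rows.fetchall())


if __name__ == "__main__":
    db = build_assay_db()
    add_record(db, "T001", "normal_lung", 1.0)   # illustrative values only
    add_record(db, "T001", "tumor", 3.5)
    print(cross_tabulate(db, "T001"))
```

Keying each record on the pair of identification code and sample source is one possible design; it lets the same target species be looked up across control and pathological specimens.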

The invention also provides for the storage and retrieval of a collection of target data
in a computer data storage apparatus, which can include magnetic disks, optical disks,
magneto-optical disks, DRAM, SRAM, SGRAM, SDRAM, RDRAM, DDR RAM, magnetic
bubble memory devices, and other data storage devices, including CPU registers and on-CPU
data storage arrays. Typically, the target data records are stored as a bit pattern in an array of
magnetic domains on a magnetizable medium or as an array of charge states or transistor gate
states, such as an array of cells in a DRAM device (e.g., each cell comprised of a transistor
and a charge storage area, which may be on the transistor). In one embodiment, the invention
provides such storage devices, and computer systems built therewith, comprising a bit pattern
encoding a protein expression fingerprint record comprising unique identifiers for at least 10
target data records cross-tabulated with target source.

When the target is a peptide or nucleic acid, the invention preferably provides a
method for identifying related peptide or nucleic acid sequences, comprising performing a
computerized comparison between a peptide or nucleic acid sequence assay record stored in
or retrieved from a computer storage device or database and at least one other sequence. The
comparison can include a sequence analysis or comparison algorithm or computer program
embodiment thereof (e.g., FASTA, TFASTA, GAP, BESTFIT) and/or the comparison may
be of the relative amount of a peptide or nucleic acid sequence in a pool of sequences
determined from a polypeptide or nucleic acid sample of a specimen.

The invention also preferably provides a magnetic disk, such as an IBM-compatible
(DOS, Windows, Windows95/98/2000, Windows NT, OS/2) or other format (e.g., Linux,
SunOS, Solaris, AIX, SCO Unix, VMS, MV, Macintosh, etc.) floppy diskette or hard (fixed,
Winchester) disk drive, comprising a bit pattern encoding data from an assay of the invention
in a file format suitable for retrieval and processing in a computerized sequence analysis,
comparison, or relative quantitation method.

The invention also provides a network, comprising a plurality of computing devices
linked via a data link, such as an Ethernet cable (coax or 10BaseT), telephone line, ISDN
line, wireless network, optical fiber, or other suitable signal transmission medium, whereby at
least one network device (e.g., computer, disk array, etc.) comprises a pattern of magnetic
domains (e.g., magnetic disk) and/or charge domains (e.g., an array of DRAM cells)
composing a bit pattern encoding data acquired from an assay of the invention.

The invention also provides a method for transmitting assay data that includes
generating an electronic signal on an electronic communications device, such as a modem,
ISDN terminal adapter, DSL, cable modem, ATM switch, or the like, wherein the signal
includes (in native or encrypted format) a bit pattern encoding data from an assay or a
database comprising a plurality of assay results obtained by the method of the invention.

In a preferred embodiment, the invention provides a computer system for comparing a
query target to a database containing an array of data structures, such as an assay result
obtained by the method of the invention, and ranking database targets based on the degree of
identity and gap weight to the target data. A central processor is preferably initialized to load
and execute the computer program for alignment and/or comparison of the assay results.
Data for a query target is entered into the central processor via an I/O device. Execution of
the computer program results in the central processor retrieving the assay data from the data
file, which comprises a binary description of an assay result.
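One way such a comparison and ranking might be sketched, purely for illustration, is with a global alignment score of the Needleman-Wunsch type, in which identical positions add to the score and gaps subtract a fixed weight. The scoring values (match +1, mismatch -1, gap -2), the function names, and the toy sequences below are assumptions of this sketch and are not taken from any particular program named herein.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap weight.

    Only the score is computed (no traceback), keeping one row of the
    dynamic-programming table at a time.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def rank_targets(query, database):
    """Rank database entries (name -> sequence) by score, best first."""
    scored = [(name, global_alignment_score(query, seq))
              for name, seq in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy sequences for illustration only.
    targets = {"exact": "ACGT", "deletion": "ACT", "unrelated": "TTTT"}
    for name, score in rank_targets("ACGT", targets):
        print(name, score)
```

Production tools typically use substitution matrices and affine gap penalties rather than the fixed values assumed here.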

The target data or record and the computer program can be transferred to secondary
memory, which is typically random access memory (e.g., DRAM, SRAM, SGRAM, or
SDRAM). Targets are ranked according to the degree of correspondence between a selected
assay characteristic (e.g., binding to a selected affinity moiety) and the same characteristic of
the query target and results are output via an I/O device. For example, a central processor
can be a conventional computer (e.g., Intel Pentium, PowerPC, Alpha, PA-8000, SPARC,
MIPS 4400, MIPS 10000, VAX, etc.); a program can be a commercial or public domain
molecular biology software package (e.g., UWGCG Sequence Analysis Software, Darwin); a
data file can be an optical or magnetic disk, a data server, a memory device (e.g., DRAM,
SRAM, SGRAM, SDRAM, EPROM, bubble memory, flash memory, etc.); an I/O device can
be a terminal comprising a video display and a keyboard, a modem, an ISDN terminal
adapter, an Ethernet port, a punched card reader, a magnetic strip reader, or other suitable I/O
device.

The invention also preferably provides the use of a computer system, such as that
described above, which comprises: (1) a computer; (2) a stored bit pattern encoding a
collection of peptide sequence specificity records obtained by the methods of the invention,
which may be stored in the computer; (3) a comparison target, such as a query target; and (4)
a program for alignment and comparison, typically with rank-ordering of comparison results
on the basis of computed similarity values.

Characteristics of lung cancer-associated proteins

Lung cancer proteins of the present invention may be classified as secreted proteins,
transmembrane proteins or intracellular proteins. In one embodiment, the lung cancer protein
is an intracellular protein. Intracellular proteins may be found in the cytoplasm and/or in the
nucleus. Intracellular proteins are involved in all aspects of cellular function and replication
(including, e.g., signaling pathways); aberrant expression of such proteins often results in
unregulated or disregulated cellular processes (see, e.g., Alberts (ed., 1994) Molecular
Biology of the Cell (3d ed.)). For example, many intracellular proteins have enzymatic
activity such as protein kinase activity, protein phosphatase activity, protease activity,
nucleotide cyclase activity, polymerase activity and the like. Intracellular proteins also serve
as docking proteins that are involved in organizing complexes of proteins, or targeting
proteins to various subcellular localizations, and are involved in maintaining the structural
integrity of organelles.

An increasingly appreciated concept in characterizing proteins is the presence in the
proteins of one or more structural motifs to which defined functions have been attributed. In
addition to the highly conserved sequences found in the enzymatic domain of proteins, highly
conserved sequences have been identified in proteins that are involved in protein-protein
interaction. For example, Src-homology-2 (SH2) domains bind tyrosine-phosphorylated
targets in a sequence dependent manner. PTB domains, which are distinct from SH2
domains, also bind tyrosine phosphorylated targets. SH3 domains bind to proline-rich
targets. In addition, PH domains, tetratricopeptide repeats and WD domains, to name only a
few, have been shown to mediate protein-protein interactions. Some of these may also be
involved in binding to phospholipids or other second messengers. As will be appreciated by
one of ordinary skill in the art, these motifs can be identified on the basis of amino acid
sequence; thus, an analysis of the sequence of proteins may provide insight into both the
enzymatic potential of the molecule and/or molecules with which the protein may associate.

One useful database is Pfam (protein families), which is a large collection of multiple
sequence alignments and hidden Markov models covering many common protein domains.
Versions are available via the internet from Washington University in St. Louis, the Sanger
Center in England, and the Karolinska Institute in Sweden (see, e.g., Bateman, et al. (2000)
Nuc. Acids Res. 28:263-266; Sonnhammer, et al. (1997) Proteins 28:405-420; Bateman, et al.
(1999) Nuc. Acids Res. 27:260-262; and Sonnhammer, et al. (1998) Nuc. Acids Res. 26:320-
322).

In another embodiment, the lung cancer sequences are transmembrane proteins.
Transmembrane proteins are molecules that span a phospholipid bilayer of a cell. They may
have an intracellular domain, an extracellular domain, or both. The intracellular domains of
such proteins may have a number of functions including those already described for
intracellular proteins. For example, the intracellular domain may have enzymatic activity
and/or may serve as a binding site for additional proteins. Frequently the intracellular
domain of transmembrane proteins serves both roles. For example, certain receptor tyrosine
kinases have both protein kinase activity and SH2 domains. In addition, autophosphorylation
of tyrosines on the receptor molecule itself creates binding sites for additional SH2 domain
containing proteins.

Transmembrane proteins may contain from one to many transmembrane domains.
For example, receptor tyrosine kinases, certain cytokine receptors, receptor guanylyl cyclases
and receptor serine/threonine protein kinases contain a single transmembrane domain.
However, various other proteins including channels, pumps, and adenylyl cyclases contain
numerous transmembrane domains. Many important cell surface receptors such as G protein
coupled receptors (GPCRs) are classified as "seven transmembrane domain" proteins, as they
contain 7 membrane spanning regions. Characteristics of transmembrane domains include
approximately 17 consecutive hydrophobic amino acids that may be followed by charged
amino acids. Therefore, upon analysis of the amino acid sequence of a particular protein, the
localization and number of transmembrane domains within the protein may be predicted (see,
e.g., PSORT web site http://psort.nibb.ac.jp/).
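As a rough illustration of how candidate transmembrane segments might be flagged from sequence alone, the sketch below scans a protein sequence for runs of at least 17 consecutive hydrophobic residues. The particular residue set (A, I, L, M, F, V, W, C), the function name, and the toy sequence are assumptions of this sketch; it is not the method used by PSORT, and practical predictors rely on hydropathy scales and more elaborate models.

```python
import re

# Assumed hydrophobic residue set (one-letter codes); other definitions exist.
HYDROPHOBIC = "AILMFVWC"


def find_hydrophobic_runs(sequence, min_length=17):
    """Return (start, end) pairs for runs of hydrophobic residues.

    Indices are 0-based with an exclusive end, so sequence[start:end]
    is the run. Only runs of at least min_length residues are reported.
    """
    pattern = re.compile("[%s]{%d,}" % (HYDROPHOBIC, min_length))
    return [(m.start(), m.end()) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Toy sequence for illustration only: two hydrophobic stretches
    # separated by charged and polar residues.
    toy = "MKRS" + "L" * 20 + "KRDE" + "AIV" * 6 + "KK"
    runs = find_hydrophobic_runs(toy)
    print(len(runs), "candidate segment(s):", runs)
```

The number of reported runs gives a crude estimate of the number of membrane-spanning regions, and the indices give their approximate location.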

The extracellular domains of transmembrane proteins are diverse; however, conserved
motifs are found repeatedly among various extracellular domains. Conserved structure
and/or functions have been ascribed to different extracellular motifs. Many extracellular
domains are involved in binding to other molecules. In one aspect, extracellular domains are
found on receptors. Factors that bind the receptor domain include circulating ligands, which
may be peptides, proteins, or small molecules such as adenosine and the like. For example,
growth factors such as EGF, FGF, and PDGF are circulating growth factors that bind to their
cognate receptors to initiate a variety of cellular responses. Other factors include cytokines,
mitogenic factors, hormones, neurotrophic factors and the like. Extracellular domains also
bind to cell-associated molecules. In this respect, they may mediate cell-cell interactions.
Cell-associated ligands can be tethered to the cell, e.g., via a glycosylphosphatidylinositol
(GPI) anchor, or may themselves be transmembrane proteins. Extracellular domains may
also associate with the extracellular matrix and contribute to the maintenance of the cell
structure.

Lung cancer proteins that are transmembrane are particularly preferred in the present
invention as they are readily accessible targets for extracellular immunotherapeutics, as are
described herein. In addition, as outlined below, transmembrane proteins can also be useful
in imaging modalities. Antibodies may be used to label such readily accessible proteins in
situ or in histological analysis. Alternatively, antibodies can also label intracellular proteins,
in which case analytical samples are typically permeabilized to provide access to intracellular
proteins. In addition, some membrane proteins can be processed to release a soluble protein,
or to expose a residual fragment. Released soluble proteins may be useful diagnostic
markers; processed residual protein fragments may be useful lung markers of disease.

It will also be appreciated by those in the art that a transmembrane protein can be
made soluble by removing transmembrane sequences, e.g., through recombinant methods.
Furthermore, transmembrane proteins that have been made soluble can be made to be
secreted through recombinant means by adding an appropriate signal sequence.

In another embodiment, the lung cancer proteins are secreted proteins, the secretion of
which can be either constitutive or regulated. These proteins may have a signal peptide or
signal sequence that targets the molecule to the secretory pathway. Secreted proteins are
involved in numerous physiological events; e.g., if circulating, they often serve to transmit
signals to various other cell types. The secreted protein may function in an autocrine manner
(acting on the cell that secreted the factor), a paracrine manner (acting on cells in close
proximity to the cell that secreted the factor), an endocrine manner (acting on cells at a
distance, e.g., secretion into the blood stream), or an exocrine manner (secretion, e.g., through
a duct or to an adjacent epithelial surface, as with sweat glands, sebaceous glands, pancreatic
ducts, lacrimal glands, mammary glands, wax producing glands of the ear, etc.). Thus secreted
molecules often find use in modulating or altering numerous aspects of physiology. Lung cancer
proteins that are secreted proteins are particularly preferred in the present invention as they
serve as good targets for diagnostic markers, e.g., for blood, plasma, serum, or stool tests.
Those which are enzymes may be antibody or small molecule targets. Others may be useful
as vaccine targets, e.g., via CTL mechanisms.

Use of lung cancer nucleic acids 

As described above, a lung cancer sequence is initially identified by substantial nucleic
acid and/or amino acid sequence homology or linkage to the lung cancer sequences outlined
herein. Such homology can be based upon the overall nucleic acid or amino acid sequence,
and is generally determined as outlined below, using either homology programs or
hybridization conditions. Typically, linked sequences on an mRNA are found on the same
molecule.

The lung cancer nucleic acid sequences of the invention, e.g., the sequences in Tables
1A-16, can be fragments of larger genes, i.e., they are nucleic acid segments. "Genes" in this
context includes coding regions, non-coding regions, and mixtures of coding and non-coding
regions. Accordingly, as will be appreciated by those in the art, using the sequences provided
herein, extended sequences, in either direction, of the lung cancer genes can be obtained,
using techniques well known in the art for cloning either longer sequences or the full length
sequences; see Ausubel, et al., supra. Much can be done by informatics and many sequences
can be clustered to include multiple sequences corresponding to a single gene, e.g., systems
such as UniGene (see, http://www.ncbi.nlm.nih.gov/UniGene/).

Once a lung cancer nucleic acid is identified, it can be cloned and, if necessary, its
constituent parts recombined to form the entire lung cancer nucleic acid coding regions or the
entire mRNA sequence. Once isolated from its natural source, e.g., contained within a
plasmid or other vector or excised therefrom as a linear nucleic acid segment, the
recombinant lung cancer nucleic acid can be further used as a probe to identify and isolate
other lung cancer nucleic acids, e.g., extended coding regions. It can also be used as a
"precursor" nucleic acid to make modified or variant lung cancer nucleic acids and proteins.

The lung cancer nucleic acids of the present invention are used in several ways. In a
first embodiment, nucleic acid probes to the lung cancer nucleic acids are made and attached
to biochips to be used in screening and diagnostic methods, as outlined below, or for
administration, e.g., for gene therapy, RNAi, vaccine, and/or antisense applications.
Alternatively, the lung cancer nucleic acids that include coding regions of lung cancer
proteins can be put into expression vectors for the expression of lung cancer proteins, again
for screening purposes or for administration to a patient.

In a preferred embodiment, nucleic acid probes to lung cancer nucleic acids (both the
nucleic acid sequences outlined in the figures and/or the complements thereof) are made.
The nucleic acid probes attached to the biochip are designed to be substantially
complementary to the lung cancer nucleic acids, i.e., the target sequence (either the target
sequence of the sample or to other probe sequences, e.g., in sandwich assays), such that
hybridization of the target sequence and the probes of the present invention occurs. As
outlined below, this complementarity need not be perfect; there may be any number of base
pair mismatches which will interfere with hybridization between the target sequence and the
single stranded nucleic acids of the present invention. However, if the number of mutations
is so great that no hybridization can occur under even the least stringent of hybridization
conditions, the sequence is not a complementary target sequence. Thus, by "substantially
complementary" herein is meant that the probes are sufficiently complementary to the target
sequences to hybridize under appropriate reaction conditions, particularly high stringency
conditions, as outlined herein.

A nucleic acid probe is generally single stranded but can be partially single and
partially double stranded. The strandedness of the probe is dictated by the structure,
composition, and properties of the target sequence. In general, the nucleic acid probes range
from about 8 to about 100 bases long, with from about 10 to about 80 bases being preferred,
and from about 30 to about 50 bases being particularly preferred. That is, generally
complements of ORFs or whole genes are not used. In some embodiments, nucleic acids of
lengths up to hundreds of bases can be used.
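
The relationship between a target sequence, its complementary probe, and the length ranges
given above can be sketched as follows. This is a minimal illustration only: the function
names and the 36-base target window are hypothetical and are not taken from the Tables, and
counting mismatches is offered merely as one simple way to picture imperfect
complementarity, not as a hybridization model.

```python
# Illustrative sketch; names and the example sequence are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def length_class(probe: str) -> str:
    """Classify a probe length against the ranges stated in the text."""
    n = len(probe)
    if 30 <= n <= 50:
        return "particularly preferred"
    if 10 <= n <= 80:
        return "preferred"
    if 8 <= n <= 100:
        return "general range"
    return "outside general range"


def count_mismatches(probe: str, target_site: str) -> int:
    """Count positions where the probe fails to pair with an equal-length target site."""
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must be the same length")
    perfect = reverse_complement(target_site)
    return sum(1 for p, q in zip(probe.upper(), perfect) if p != q)


# Made-up 36-base window of a target sequence and its perfectly matched probe.
target_site = "ATGGCCAAGCTTGGATCCGAATTCAAGGTCTGACCA"
probe = reverse_complement(target_site)
```

A probe with a small number of mismatches may still be substantially complementary in the
sense defined above; whether it hybridizes depends on the stringency conditions, which this
sketch does not attempt to model.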

In a preferred embodiment, more than one probe per sequence is used, with either
overlapping probes or probes to different sections of the target being used. That is, two,
three, four or more probes, with three being preferred, are used to build in a redundancy for a
particular target. The probes can be overlapping (i.e., have some sequence in common), or
separate. In some cases, PCR primers may be used to amplify signal for higher sensitivity.
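
The choice between overlapping and separate probes for a single target can be pictured with
a short sketch. The function name, the window arithmetic, and the default of three probes
(the preferred number stated above) are illustrative assumptions; actual probe placement
would also weigh sequence uniqueness and hybridization behavior.

```python
# Illustrative sketch; the function and its parameters are hypothetical.
def tile_probe_windows(target_len, probe_len, n_probes=3, overlap=0):
    """Return (start, end) index pairs for n_probes probes along one target.

    overlap is the number of bases shared by consecutive probes;
    overlap=0 gives separate, adjacent probes.
    """
    step = probe_len - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than probe length")
    windows = []
    for i in range(n_probes):
        start = i * step
        end = start + probe_len
        if end > target_len:
            raise ValueError("target too short for the requested probes")
        windows.append((start, end))
    return windows


separate = tile_probe_windows(200, 40)                 # no shared sequence
overlapping = tile_probe_windows(200, 40, overlap=10)  # 10 bases in common
```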

As will be appreciated by those in the art, nucleic acids can be attached or
immobilized to a solid support in a wide variety of ways. By "immobilized" and grammatical
equivalents herein is meant the association or binding between the nucleic acid probe and the
solid support is sufficient to be stable under the conditions of binding, washing, analysis, and
removal as outlined below. The binding can typically be covalent or non-covalent. By
"non-covalent binding" and grammatical equivalents herein is typically meant one or more of
electrostatic, hydrophilic, and hydrophobic interactions. Included in non-covalent binding is
the covalent attachment of a molecule, such as streptavidin, to the support and the
non-covalent binding of the biotinylated probe to the streptavidin. By "covalent binding" and
grammatical equivalents herein is meant that the two moieties, the solid support and the
probe, are attached by at least one bond, including sigma bonds, pi bonds and coordination
bonds. Covalent bonds can be formed directly between the probe and the solid support or can
be formed by a cross linker or by inclusion of a specific reactive group on either the solid
support or the probe or both molecules. Immobilization may also involve a combination of
covalent and non-covalent interactions.

In general, the probes are attached to a biochip in a wide variety of ways, as will be
appreciated by those in the art. As described herein, the nucleic acids can either be
synthesized first, with subsequent attachment to the biochip, or can be directly synthesized on
the biochip.

The biochip comprises a suitable solid substrate. By "substrate" or "solid support" or
other grammatical equivalents herein is meant a material that can be modified for the
attachment or association of the nucleic acid probes and is amenable to at least one detection
method. Often the substrate may contain discrete individual sites appropriate for individual
partitioning and identification. As will be appreciated by those in the art, the number of
possible substrates is very large, and includes, but is not limited to, glass and modified or
functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and
other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.),
polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including
silicon and modified silicon, carbon, metals, inorganic glasses, plastics, etc. In general, the
substrates allow optical detection and do not appreciably fluoresce. A preferred substrate is
described in U.S. Application entitled Reusable Low Fluorescent Plastic Biochip, U.S.
Application Serial No. 09/270,214, filed March 15, 1999, herein incorporated by reference in
its entirety.

Generally the substrate is planar, although as will be appreciated by those in the art,
other configurations of substrates may be used as well. For example, the probes may be
placed on the inside surface of a tube, for flow-through sample analysis to minimize sample
volume. Similarly, the substrate may be flexible, such as a flexible foam, including closed
cell foams made of particular plastics.

In a preferred embodiment, the surface of the biochip and the probe may be
derivatized with chemical functional groups for subsequent attachment of the two. Thus, e.g.,
the biochip is derivatized with a chemical functional group including, but not limited to,
amino groups, carboxy groups, oxo groups and thiol groups, with amino groups being
particularly preferred. Using these functional groups, the probes can be attached using
functional groups on the probes. For example, nucleic acids containing amino groups can be
attached to surfaces comprising amino groups, e.g., using linkers as are known in the art; e.g.,
homo- or hetero-bifunctional linkers as are well known (see 1994 Pierce Chemical Company
catalog, technical section on cross-linkers, pages 155-200). In addition, in some cases,
additional linkers, such as alkyl groups (including substituted and heteroalkyl groups) may be
used.

In this embodiment, oligonucleotides are synthesized, and then attached to the surface
of the solid support. Either the 5' or 3' terminus may be attached to the solid support, or
attachment may be via linkage to an internal nucleoside.

In another embodiment, the immobilization to the solid support may be very strong,
yet non-covalent. For example, biotinylated oligonucleotides can be made, which bind to
surfaces covalently coated with streptavidin, resulting in attachment.

Alternatively, the oligonucleotides may be synthesized on the surface, as is known in
the art. For example, photoactivation techniques utilizing photopolymerization compounds
and techniques are used. In a preferred embodiment, the nucleic acids can be synthesized in
situ, using known photolithographic techniques, such as those described in WO 95/25116;
WO 95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references cited within, all of
which are expressly incorporated by reference; these methods of attachment form the basis of
the Affymetrix GeneChip™ technology.

Often, amplification-based assays are performed to measure the expression level of
lung cancer-associated sequences. These assays are typically performed in conjunction with
reverse transcription. In such assays, a lung cancer-associated nucleic acid sequence acts as a
template in an amplification reaction (e.g., Polymerase Chain Reaction, or PCR). In a
quantitative amplification, the amount of amplification product will be proportional to the
amount of template in the original sample. Comparison to appropriate controls provides a
measure of the amount of lung cancer-associated RNA. Methods of quantitative
amplification are well known to those of skill in the art. Detailed protocols for quantitative
PCR are provided, e.g., in Innis, et al. (1990) PCR Protocols. A Guide to Methods and
Applications.
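
The comparison to controls described above can be illustrated with a small calculation. The
sketch below uses the comparative threshold-cycle (often written 2^-ddCt) convention, which
is a commonly used approach and is not taken from this specification; the function name, the
reference-gene normalization, and the assumption that product doubles each cycle
(efficiency = 2.0) are all illustrative assumptions.

```python
# Illustrative sketch; the comparative-Ct convention is an assumption, not the source's method.
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control,
                        efficiency=2.0):
    """Estimate fold change of a target transcript in a sample versus a control.

    Each Ct is the cycle at which signal crosses a fixed threshold; a lower Ct
    means more starting template. The reference transcript normalizes for input.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return efficiency ** (-delta_delta)


# Hypothetical readings: target crosses threshold 3 cycles earlier in the sample.
fold = relative_expression(22.0, 18.0, 25.0, 18.0)
```

With these made-up readings the target is estimated at eight-fold the control level, since
each cycle of difference corresponds to one doubling under the stated efficiency assumption.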

In some embodiments, a TaqMan based assay is used to measure expression.
TaqMan based assays use a fluorogenic oligonucleotide probe that contains a 5' fluorescent
dye and a 3' quenching agent. The probe hybridizes to a PCR product, but cannot itself be
extended due to a blocking agent at the 3' end. When the PCR product is amplified in
subsequent cycles, the 5' nuclease activity of the polymerase, e.g., AmpliTaq, results in the
cleavage of the TaqMan probe. This cleavage separates the 5' fluorescent dye and the 3'
quenching agent, thereby resulting in an increase in fluorescence as a function of
amplification (see, e.g., literature provided by Perkin-Elmer, e.g., www2.perkin-elmer.com).

Other suitable amplification methods include, but are not limited to, ligase chain
reaction (LCR) (see Wu and Wallace (1989) Genomics 4:560, Landegren, et al. (1988)
Science 241:1077, and Barringer, et al. (1990) Gene 89:117), transcription amplification
(Kwoh, et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173), self-sustained sequence
replication (Guatelli, et al. (1990) Proc. Nat. Acad. Sci. USA 87:1874), dot PCR, and linker
adapter PCR, etc.

Expression of lung cancer proteins from nucleic acids 

In a preferred embodiment, lung cancer nucleic acids, e.g., encoding lung cancer
proteins, are used to make a variety of expression vectors to express lung cancer proteins
which can then be used in screening assays, as described below. Expression vectors and
recombinant DNA technology are well known to those of skill in the art (see, e.g., Ausubel,
supra, and Fernandez and Hoeffler (eds. 1999) Gene Expression Systems) and are used to
express proteins. The expression vectors may be either self-replicating extrachromosomal
vectors or vectors which integrate into a host genome. Generally, these expression vectors
include transcriptional and translational regulatory nucleic acid operably linked to the nucleic
acid encoding the lung cancer protein. The term "control sequences" refers to DNA
sequences used for the expression of an operably linked coding sequence in a particular host
organism. Control sequences that are suitable for prokaryotes, e.g., include a promoter,
optionally an operator sequence, and a ribosome binding site. Eukaryotic cells are known to
utilize promoters, polyadenylation signals, and enhancers.

Nucleic acid is "operably linked" when it is placed into a functional relationship with
another nucleic acid sequence. For example, DNA for a presequence or secretory leader is
operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in
the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding
sequence if it affects the transcription of the sequence; or a ribosome binding site is operably
linked to a coding sequence if it is positioned so as to facilitate translation. Generally,
"operably linked" means that the DNA sequences being linked are contiguous, and, in the
case of a secretory leader, contiguous and in reading phase. However, enhancers do not have
to be contiguous. Linking is typically accomplished by ligation at convenient restriction
sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in
accordance with conventional practice. Transcriptional and translational regulatory nucleic
acid will generally be appropriate to the host cell used to express the lung cancer protein.
Numerous types of appropriate expression vectors, and suitable regulatory sequences are
known in the art for a variety of host cells.
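
The requirement that a secretory leader be contiguous with, and in reading phase with, the
coding sequence can be pictured with a simple check. This is a rough sketch under stated
assumptions: the function names and example sequences are hypothetical, only the standard
stop codons TAA, TAG and TGA are considered, and no other aspect of a functional linkage
(promoter placement, ribosome binding site spacing) is examined.

```python
# Illustrative sketch; names and sequences are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(dna):
    """Split DNA into consecutive complete triplets starting at position 0."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def fused_in_frame(leader, coding):
    """Return True if leader + coding reads through as one open frame.

    The leader must start with ATG and have a length divisible by three,
    and no stop codon may occur before the final codon of the fusion.
    """
    leader, coding = leader.upper(), coding.upper()
    if not leader.startswith("ATG") or len(leader) % 3 != 0:
        return False
    fused = codons(leader + coding)
    return not any(c in STOP_CODONS for c in fused[:-1])


in_phase = fused_in_frame("ATGAAAGCT", "GGTCCATAA")     # leader of 9 bases
out_of_phase = fused_in_frame("ATGAAAGC", "GGTCCATAA")  # leader of 8 bases
```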

In general, transcriptional and translational regulatory sequences may include, but are
not limited to, promoter sequences, ribosomal binding sites, transcriptional start and stop
sequences, translational start and stop sequences, and enhancer or activator sequences. In a
preferred embodiment, the regulatory sequences include a promoter and transcriptional start
and stop sequences.

Promoter sequences may be either constitutive or inducible promoters. The promoters
may be either naturally occurring promoters or hybrid promoters. Hybrid promoters, which
combine elements of more than one promoter, are also known in the art, and are useful in the
present invention.

In addition, an expression vector may comprise additional elements. For example, the
expression vector may have two replication systems, thus allowing it to be maintained in two
organisms, e.g., in mammalian or insect cells for expression and in a prokaryotic host for
cloning and amplification. Furthermore, for integrating expression vectors, the expression
vector often contains at least one sequence homologous to the host cell genome, and
preferably two homologous sequences which flank the expression construct. The integrating
vector may be directed to a specific locus in the host cell by selecting the appropriate
homologous sequence for inclusion in the vector. Constructs for integrating vectors are well
known in the art (e.g., Fernandez and Hoeffler, supra).

In addition, in a preferred embodiment, the expression vector contains a selectable
marker gene to allow the selection of transformed host cells. Selection genes are well known
in the art and will vary with the host cell used.

The lung cancer proteins of the present invention are usually produced by culturing a
host cell transformed with an expression vector containing nucleic acid encoding a lung
cancer protein, under the appropriate conditions to induce or cause expression of the lung
cancer protein. Conditions appropriate for lung cancer protein expression will vary with the
choice of the expression vector and the host cell, and will be easily ascertained by one skilled
in the art through routine experimentation or optimization. For example, the use of
constitutive promoters in the expression vector will require optimizing the growth and
proliferation of the host cell, while the use of an inducible promoter requires the appropriate
growth conditions for induction. In addition, in some embodiments, the timing of the harvest
is important. For example, the baculoviral systems used in insect cell expression are lytic
viruses, and thus harvest time selection can be crucial for product yield.

Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect and
animal cells, including mammalian cells. Of particular interest are Saccharomyces cerevisiae
and other yeasts, E. coli, Bacillus subtilis, Sf9 cells, C129 cells, 293 cells, Neurospora, BHK,
CHO, COS, HeLa cells, HUVEC (human umbilical vein endothelial cells), THP1 cells (a
macrophage cell line) and various other human cells and cell lines.

In a preferred embodiment, the lung cancer proteins are expressed in mammalian
cells. Mammalian expression systems are also known in the art, and include retroviral and
adenoviral systems. Of particular use as mammalian promoters are the promoters from
mammalian viral genes, since the viral genes are often highly expressed and have a broad
host range. Examples include the SV40 early promoter, mouse mammary tumor virus LTR
promoter, adenovirus major late promoter, herpes simplex virus promoter, and the CMV
promoter (see, e.g., Fernandez and Hoeffler, supra). Typically, transcription termination and
polyadenylation sequences recognized by mammalian cells are regulatory regions located 3'
to the translation stop codon and thus, together with the promoter elements, flank the coding
sequence. Examples of transcription terminator and polyadenylation signals include those
derived from SV40.

The methods of introducing exogenous nucleic acid into mammalian hosts, as well as
other hosts, are well known in the art, and will vary with the host cell used. Techniques
include dextran-mediated transfection, calcium phosphate precipitation, polybrene mediated
transfection, protoplast fusion, electroporation, viral infection, encapsulation of the
polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei.

In a preferred embodiment, lung cancer proteins are expressed in bacterial systems.
Promoters from bacteriophage may also be used and are known in the art. In addition,
synthetic promoters and hybrid promoters are also useful; e.g., the tac promoter is a hybrid of
the trp and lac promoter sequences. Furthermore, a bacterial promoter can include naturally
occurring promoters of non-bacterial origin that have the ability to bind bacterial RNA
polymerase and initiate transcription. In addition to a functioning promoter sequence, an
efficient ribosome binding site is desirable. The expression vector may also include a signal
peptide sequence that provides for secretion of the lung cancer protein in bacteria. The
protein is either secreted into the growth media (gram-positive bacteria) or into the
periplasmic space, located between the inner and outer membrane of the cell (gram-negative
bacteria). The bacterial expression vector may also include a selectable marker gene to allow
for the selection of bacterial strains that have been transformed. Suitable selection genes
include genes which render the bacteria resistant to drugs such as ampicillin,
chloramphenicol, erythromycin, kanamycin, neomycin and tetracycline. Selectable markers
also include biosynthetic genes, such as those in the histidine, tryptophan and leucine
biosynthetic pathways. These components are assembled into expression vectors. Expression
vectors for bacteria are well known in the art, and include vectors for Bacillus subtilis, E.
coli, Streptococcus cremoris, and Streptococcus lividans, among others (e.g., Fernandez and
Hoeffler, supra). The bacterial expression vectors are transformed into bacterial host cells
using techniques well known in the art, such as calcium chloride treatment, electroporation,
and others.

In one embodiment, lung cancer proteins are produced in insect cells. Expression
vectors for the transformation of insect cells, and in particular, baculovirus-based expression
vectors, are well known in the art.

In a preferred embodiment, lung cancer protein is produced in yeast cells. Yeast
expression systems are well known in the art, and include expression vectors for
Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula polymorpha,
Kluyveromyces fragilis and K. lactis, Pichia guillerimondii and P. pastoris,
Schizosaccharomyces pombe, and Yarrowia lipolytica.

The lung cancer protein may also be made as a fusion protein, using techniques well
known in the art. Thus, e.g., for the creation of monoclonal antibodies, if the desired epitope
is small, the lung cancer protein may be fused to a carrier protein to form an immunogen.
Alternatively, the lung cancer protein may be made as a fusion protein to increase expression,
for affinity purification purposes, or for other reasons. For example, when the lung cancer
protein is a lung cancer peptide, the nucleic acid encoding the peptide may be linked to other
nucleic acid for expression purposes.

In a preferred embodiment, the lung cancer protein is purified or isolated after
expression. Lung cancer proteins may be isolated or purified in a variety of appropriate
ways. Standard purification methods include electrophoretic, molecular, immunological and
chromatographic techniques, including ion exchange, hydrophobic, affinity, and
reverse-phase HPLC chromatography, and chromatofocusing. For example, the lung cancer protein
may be purified using a standard anti-lung cancer protein antibody column. Ultrafiltration
and diafiltration techniques, in conjunction with protein concentration, are also useful. For
general guidance in suitable purification techniques, see Scopes (1982) Protein Purification.
The degree of purification necessary will vary depending on the use of the lung cancer
protein. In some instances no purification will be necessary.

Once expressed and purified if necessary, the lung cancer proteins and nucleic acids
are useful in a number of applications. They may be used as immunoselection reagents, as
vaccine reagents, as screening agents, therapeutic entities, for production of antibodies, as
transcription or translation inhibitors, etc.

Variants of lung cancer proteins

In one embodiment, the lung cancer proteins are derivative or variant lung cancer
proteins as compared to the wild-type sequence. That is, as outlined more fully below, the
derivative lung cancer peptide will often contain at least one amino acid substitution, deletion
or insertion, with amino acid substitutions being particularly preferred. The amino acid
substitution, insertion or deletion may occur at a particular residue within the lung cancer
peptide.

Also included within one embodiment of lung cancer proteins of the present invention
are amino acid sequence variants. These variants typically fall into one or more of three
classes: substitutional, insertional or deletional variants. These variants ordinarily are
prepared by site specific mutagenesis of nucleotides in the DNA encoding the lung cancer
protein, using cassette or PCR mutagenesis or other techniques, to produce DNA encoding
the variant, and thereafter expressing the DNA in recombinant cell culture as outlined above.
However, variant lung cancer protein fragments having up to about 100-150 residues may be
prepared by in vitro synthesis. Amino acid sequence variants are characterized by the
predetermined nature of the variation, a feature that sets them apart from naturally occurring
allelic or interspecies variation of the lung cancer protein amino acid sequence. The variants
typically exhibit a similar qualitative biological activity as the naturally occurring analogue,
although variants can also be selected which have modified characteristics as will be more
fully outlined below.

While the site or region for introducing an amino acid sequence variation is often
predetermined, the mutation per se need not be predetermined. For example, in order to
optimize the performance of a mutation at a given site, random mutagenesis may be
conducted at the target codon or region and the expressed lung cancer variants screened for
the optimal combination of desired activity. Techniques exist for making substitution
mutations at predetermined sites in DNA having a known sequence, e.g., M13 primer
mutagenesis and PCR mutagenesis. Screening of mutants is often done using assays of lung
cancer protein activities.

Amino acid substitutions are typically of single residues; insertions usually will be on
the order of from about 1 to 20 amino acids, although considerably larger insertions may be
occasionally tolerated. Deletions generally range from about 1 to about 20 residues, although
in some cases deletions may be much larger.
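
The three classes of variant and the typical sizes given above can be made concrete with a
short sketch operating on a protein sequence written in one-letter code. The helper names,
the 1-based position convention, and the ten-residue example sequence are hypothetical and
serve only to illustrate the three operations.

```python
# Illustrative sketch; helper names and the example sequence are hypothetical.
def substitute(seq, position, new_residue):
    """Replace the single residue at a 1-based position."""
    i = position - 1
    return seq[:i] + new_residue + seq[i + 1:]


def insert(seq, position, residues):
    """Insert residues immediately after a 1-based position."""
    return seq[:position] + residues + seq[position:]


def delete(seq, position, length):
    """Delete `length` residues starting at a 1-based position."""
    i = position - 1
    return seq[:i] + seq[i + length:]


def is_typical_size(n_residues):
    """True if an insertion or deletion falls in the roughly 1 to 20 residue range."""
    return 1 <= n_residues <= 20


wild_type = "MKTAYIAKQR"                          # made-up ten-residue sequence
substitutional = substitute(wild_type, 4, "G")    # A4G
insertional = insert(wild_type, 3, "GG")          # two residues after position 3
deletional = delete(wild_type, 2, 3)              # remove residues 2 through 4
```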

Substitutions, deletions, insertions or any combination thereof may be used to arrive
at a final derivative. Generally these changes are done on a few amino acids to minimize the
alteration of the molecule. Larger changes may be tolerated in certain circumstances. When
small alterations in the characteristics of a lung cancer protein are desired, substitutions are
generally made in accordance with the amino acid substitution chart provided in the
definition section.

Variants typically exhibit essentially the same qualitative biological activity and will
elicit the same immune response as a naturally-occurring analog, although variants also are
selected to modify the characteristics of lung cancer proteins as needed. Alternatively, the
variant may be designed or reorganized such that a biological activity of the lung cancer
protein is altered. For example, glycosylation sites may be added, altered, or removed.

Covalent modifications of lung cancer polypeptides are included within the scope of
this invention. One type of covalent modification includes reacting targeted amino acid
residues of a lung cancer polypeptide with an organic derivatizing agent that is capable of
reacting with selected side chains or the N- or C-terminal residues of a lung cancer
polypeptide. Derivatization with bifunctional agents is useful, for instance, for crosslinking
lung cancer polypeptides to a water-insoluble support matrix or surface for use in a method
for purifying anti-lung cancer polypeptide antibodies or screening assays, as is more fully
described below. Commonly used crosslinking agents include, e.g.,
1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, e.g., esters
with 4-azidosalicylic acid, homobifunctional imidoesters, including disuccinimidyl esters such
as 3,3'-dithiobis(succinimidylpropionate), bifunctional maleimides such as
bis-N-maleimido-1,8-octane and agents such as methyl-3-((p-azidophenyl)dithio)propioimidate.

Other modifications include deamidation of glutaminyl and asparaginyl residues to
the corresponding glutamyl and aspartyl residues, respectively, hydroxylation of proline and
lysine, phosphorylation of hydroxyl groups of serinyl, threonyl or tyrosyl residues,
methylation of the y-amino groups of lysine, arginine, and histidine side chains (Creighton
(1983) Proteins: Structure and Molecular Properties, pp. 79-86), acetylation of the N-terminal
amine, and amidation of any C-terminal carboxyl group.

Another type of covalent modification of the lung cancer polypeptide encompassed by
this invention is an altered native glycosylation pattern of the polypeptide. "Altering the
native glycosylation pattern" is intended herein to mean adding to or deleting one or more
carbohydrate moieties of a native sequence lung cancer polypeptide. Glycosylation patterns
can be altered in many ways. For example, the use of different cell types to express lung
cancer-associated sequences can result in different glycosylation patterns.

Addition of glycosylation sites to lung cancer polypeptides may also be accomplished
by altering the amino acid sequence thereof. The alteration may be made, e.g., by the
addition of, or substitution by, one or more serine or threonine residues to the native sequence
lung cancer polypeptide (for O-linked glycosylation sites). The lung cancer amino acid
sequence may optionally be altered through changes at the DNA level, particularly by
mutating the DNA encoding the lung cancer polypeptide at preselected bases such that
codons are generated that will translate into the desired amino acids.

Another means of increasing the number of carbohydrate moieties on the lung cancer
polypeptide is by chemical or enzymatic coupling of glycosides to the polypeptide. Such
methods are described in the art, e.g., in WO 87/05330, and in Aplin and Wriston (1981)
CRC Crit. Rev. Biochem., pp. 259-306.

Removal of carbohydrate moieties present on the lung cancer polypeptide may be
accomplished chemically or enzymatically or by mutational substitution of codons encoding
for amino acid residues that serve as targets for glycosylation. Chemical deglycosylation
techniques are known in the art and described, for instance, by Hakimuddin, et al. (1987)
Arch. Biochem. Biophys. 259:52 and by Edge, et al. (1981) Anal. Biochem. 118:131.
Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the use of a
variety of endo- and exo-glycosidases as described by Thotakura, et al. (1987) Meth.
Enzymol. 138:350.

Another type of covalent modification of lung cancer polypeptides comprises linking the lung
cancer polypeptide to one of a variety of nonproteinaceous polymers, e.g., polyethylene
glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth in U.S. Patent
Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192, or 4,179,337.

Lung cancer polypeptides of the present invention may also be modified in a way to
form chimeric molecules comprising a lung cancer polypeptide fused to another,
heterologous polypeptide or amino acid sequence. In one embodiment, such a chimeric
molecule comprises a fusion of a lung cancer polypeptide with a tag polypeptide which
provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is
generally placed at the amino- or carboxyl-terminus of the lung cancer polypeptide. The
presence of such epitope-tagged forms of a lung cancer polypeptide can be detected using an
antibody against the tag polypeptide. Also, provision of the epitope tag enables the lung
cancer polypeptide to be readily purified by affinity purification using an anti-tag antibody or
another type of affinity matrix that binds to the epitope tag. In an alternative embodiment,
the chimeric molecule may comprise a fusion of a lung cancer polypeptide with an
immunoglobulin or a particular region of an immunoglobulin. For a bivalent form of the
chimeric molecule, such a fusion could be to the Fc region of an IgG molecule.

Various tag polypeptides and their respective antibodies are well known and examples
include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags; HIS6 and metal
chelation tags; the flu HA tag polypeptide and its antibody 12CA5 (Field, et al. (1988) Mol.
Cell. Biol. 8:2159-2165); the c-myc tag and the 8F9, 3C7, 6E10, G4, B7 and 9E10 antibodies
thereto (Evan, et al. (1985) Molecular and Cellular Biology 5:3610-3616); and the Herpes
Simplex virus glycoprotein D (gD) tag and its antibody (Paborsky, et al. (1990) Protein
Engineering 3(6):547-553). Other tag polypeptides include the Flag-peptide (Hopp, et al.
(1988) BioTechnology 6:1204-1210); the KT3 epitope peptide (Martin, et al. (1992) Science
255:192-194); tubulin epitope peptide (Skinner, et al. (1991) J. Biol. Chem.
266:15163-15166); and the T7 gene 10 protein peptide tag (Lutz-Freyermuth, et al. (1990)
Proc. Nat'l Acad. Sci. USA 87:6393-6397).

Also included are other lung cancer proteins of the lung cancer family, and lung
cancer proteins from other organisms, which are cloned and expressed as outlined below.
Thus, probe or degenerate polymerase chain reaction (PCR) primer sequences may be used to
find other related lung cancer proteins from primates or other organisms. As will be
appreciated by those in the art, particularly useful probe and/or PCR primer sequences
include unique areas of the lung cancer nucleic acid sequence. As is generally known in the
art, preferred PCR primers are from about 15 to about 35 nucleotides in length, with from
about 20 to about 30 being preferred, and may contain inosine as needed. PCR reaction
conditions are well known in the art (e.g., Innis, PCR Protocols, supra).
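
By way of illustration only, the primer length ranges stated above can be checked with a
short Python sketch. The function name, the returned labels, the one-letter code I for
inosine, and the treatment of "about" as hard cutoffs are assumptions made for this example
and are not part of the disclosure.

```python
ALLOWED_BASES = set("ACGTI")  # I = inosine (assumed one-letter code)


def classify_primer_length(primer: str) -> str:
    """Report which stated length range a candidate primer falls into."""
    seq = primer.upper()
    if not seq or set(seq) - ALLOWED_BASES:
        raise ValueError("primer must contain only A, C, G, T, or I")
    n = len(seq)
    if 20 <= n <= 30:
        return "20-30"    # narrower range stated in the text
    if 15 <= n <= 35:
        return "15-35"    # broader range stated in the text
    return "outside"      # shorter than 15 or longer than 35 nucleotides
```

A primer of 24 nucleotides would fall in the narrower range, while one of 15 nucleotides
would fall only in the broader range.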

Antibodies to lung cancer proteins 

In a preferred embodiment, when a lung cancer protein is to be used to generate
antibodies, e.g., for immunotherapy or immunodiagnosis, the lung cancer protein should
share at least one epitope or determinant with the full length protein. By "epitope" or
"determinant" herein is typically meant a portion of a protein which will generate and/or bind
an antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies
made to a smaller lung cancer protein will be able to bind to the full-length protein,
particularly linear epitopes. In a preferred embodiment, the epitope is unique; that is,
antibodies generated to a unique epitope show little or no cross-reactivity.

Methods of preparing polyclonal antibodies are well known (e.g., Coligan, supra; and
Harlow and Lane, supra). Polyclonal antibodies can be raised in a mammal, e.g., by one or
more injections of an immunizing agent and, if desired, an adjuvant. Typically, the
immunizing agent and/or adjuvant will be injected in the mammal by multiple subcutaneous
or intraperitoneal injections. The immunizing agent may include a protein encoded by a
nucleic acid of Tables 1A-16 or fragment thereof or a fusion protein thereof. It may be useful
to conjugate the immunizing agent to a protein known to be immunogenic in the mammal
being immunized. Immunogenic proteins include, e.g., keyhole limpet hemocyanin, serum
albumin, bovine thyroglobulin, and soybean trypsin inhibitor. Adjuvants include, e.g.,
Freund's complete adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic
trehalose dicorynomycolate). The immunization protocol may be selected by one skilled in
the art.

The antibodies may, alternatively, be monoclonal antibodies. Monoclonal antibodies
may be prepared using hybridoma methods, such as those described by Kohler and Milstein
(1975) Nature 256:495. In a hybridoma method, a mouse, hamster, or other appropriate host
animal is typically immunized with an immunizing agent to elicit lymphocytes that produce
or are capable of producing antibodies that will specifically bind to the immunizing agent.
Alternatively, the lymphocytes may be immunized in vitro. The immunizing agent will
typically include a polypeptide encoded by a nucleic acid of the tables, or fragment thereof
or a fusion protein thereof. Generally, either peripheral blood lymphocytes ("PBLs") are
used if cells of human origin are desired, or spleen cells or lymph node cells are used if
non-human mammalian sources are desired. The lymphocytes are then fused with an
immortalized cell line using a suitable fusing agent, such as polyethylene glycol, to form a
hybridoma cell (Goding (1986) Monoclonal Antibodies: Principles and Practice, pp. 59-103).
Immortalized cell lines are usually transformed mammalian cells, particularly myeloma cells
of rodent, bovine, or primate origin. Usually, rat or mouse myeloma cell lines are employed.
The hybridoma cells may be cultured in a suitable culture medium that preferably contains
one or more substances that inhibit the growth or survival of the unfused, immortalized cells.
For example, if the parental cells lack the enzyme hypoxanthine guanine phosphoribosyl
transferase (HGPRT or HPRT), the culture medium for the hybridomas typically will include
hypoxanthine, aminopterin, and thymidine ("HAT medium"), which substances prevent the
growth of HGPRT-deficient cells.

In one embodiment, the antibodies are bispecific antibodies. Bispecific antibodies are
typically monoclonal, preferably human or humanized, antibodies that have binding
specificities for at least two different antigens or that have binding specificities for two
epitopes on the same antigen. In one embodiment, one of the binding specificities is for a
protein encoded by a nucleic acid of the tables or a fragment thereof; the other one is for any
other antigen, and preferably for a cell-surface protein or receptor or receptor subunit,
preferably one that is tumor specific. Alternatively, tetramer-type technology may create
multivalent reagents.

In a preferred embodiment, the antibodies to lung cancer protein are capable of
reducing or eliminating a biological function of a lung cancer protein, in a naked form or
conjugated to an effector moiety. That is, the addition of anti-lung cancer protein antibodies
(either polyclonal or preferably monoclonal) to lung cancer tissue (or cells containing lung
cancer) may reduce or eliminate the lung cancer. Generally, at least a 25% decrease in
activity, growth, size or the like is preferred, with at least about 50% being particularly
preferred and about a 95-100% decrease being especially preferred.

In a preferred embodiment the antibodies to the lung cancer proteins are humanized
antibodies (e.g., Xenerex Biosciences, Medarex, Inc., Abgenix, Inc., Protein Design Labs,
Inc.). Humanized forms of non-human (e.g., murine) antibodies are chimeric molecules of
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab',
F(ab')2 or other antigen-binding subsequences of antibodies) which contain minimal
sequence derived from non-human immunoglobulin. Humanized antibodies include human
immunoglobulins (recipient antibody) in which residues from a complementary determining
region (CDR) of the recipient are replaced by residues from a CDR of a non-human species
(donor antibody) such as mouse, rat or rabbit having the desired specificity, affinity and
capacity. In some instances, Fv framework residues of a human immunoglobulin are
replaced by corresponding non-human residues. Humanized antibodies may also comprise
residues which are found neither in the recipient antibody nor in the imported CDR or
framework sequences. In general, a humanized antibody will comprise substantially all of at
least one, and typically two, variable domains, in which all or substantially all of the CDR
regions correspond to those of a non-human immunoglobulin and all or substantially all of
the framework (FR) regions are those of a human immunoglobulin consensus sequence. A
humanized antibody optimally also will comprise at least a portion of an
immunoglobulin constant region (Fc), typically that of a human immunoglobulin (Jones, et
al. (1986) Nature 321:522-525; Riechmann, et al. (1988) Nature 332:323-327; and Presta
(1992) Curr. Op. Struct. Biol. 2:593-596). Humanization can be performed following the
method of Winter and co-workers (Jones, et al. (1986) Nature 321:522-525; Riechmann, et al.
(1988) Nature 332:323-327; Verhoeyen, et al. (1988) Science 239:1534-1536), by
substituting rodent CDRs or CDR sequences for corresponding sequences of a human

antibody. Accordingly, such humanized antibodies are chimeric antibodies (U.S. Patent No.
4,816,567), wherein substantially less than an intact human variable domain has been
substituted by corresponding sequence from a non-human species.

Human-like antibodies can also be produced using various techniques known in the
art, including phage display libraries (Hoogenboom and Winter (1991) J. Mol. Biol. 227:381;
Marks, et al. (1991) J. Mol. Biol. 222:581). The techniques of Cole, et al. and Boerner, et al.
are also available for the preparation of human monoclonal antibodies (Cole, et al. (1985)
Monoclonal Antibodies and Cancer Therapy, p. 77 and Boerner, et al. (1991) J. Immunol.
147(1):86-95). Similarly, human antibodies can be made by introducing human
immunoglobulin loci into transgenic animals, e.g., mice in which the endogenous
immunoglobulin genes have been partially or completely inactivated. Upon challenge,
human antibody production is observed, which closely resembles that seen in humans in
nearly all respects, including gene rearrangement, assembly, and antibody repertoire. This
approach is described, e.g., in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in the following scientific publications: Marks, et al. (1992)
Bio/Technology 10:779-783; Lonberg, et al. (1994) Nature 368:856-859; Morrison (1994)
Nature 368:812-13; Fishwild, et al. (1996) Nature Biotechnology 14:845-51; Neuberger
(1996) Nature Biotechnology 14:826; and Lonberg and Huszar (1995) Intern. Rev. Immunol.
13:65-93.

By immunotherapy is meant treatment of lung cancer with an antibody raised against
a lung cancer protein. As used herein, immunotherapy can be passive or active. Passive
immunotherapy as defined herein is the passive transfer of antibody to a recipient (patient).
Active immunization is the induction of antibody and/or T-cell responses in a recipient
(patient). Induction of an immune response is the result of providing the recipient with an
antigen to which antibodies are raised. The antigen may be provided by injecting a
polypeptide against which antibodies are desired to be raised into a recipient, or contacting
the recipient with a nucleic acid capable of expressing the antigen and under conditions for
expression of the antigen, leading to an immune response.

In a preferred embodiment the lung cancer proteins against which antibodies are
raised are secreted proteins as described above. Without being bound by theory, antibodies
used for treatment may bind and prevent the secreted protein from binding to its receptor,
thereby inactivating the secreted lung cancer protein.

In another preferred embodiment, the lung cancer protein to which antibodies are
raised is a transmembrane protein. Without being bound by theory, antibodies used for
treatment may bind the extracellular domain of the lung cancer protein and prevent it from
binding to other proteins, such as circulating ligands or cell-associated molecules. The
antibody may cause down-regulation of the transmembrane lung cancer protein. The
antibody may be a competitive, non-competitive or uncompetitive inhibitor of protein binding
to the extracellular domain of the lung cancer protein. The antibody may be an antagonist of
the lung cancer protein or may prevent activation of a transmembrane lung cancer protein, or
may induce or suppress a particular cellular pathway. In some embodiments, when the
antibody prevents the binding of other molecules to the lung cancer protein, the antibody
prevents growth of the cell. The antibody may also be used to target or sensitize the cell to
cytotoxic agents, including, but not limited to, TNF-α, TNF-β, IL-1, INF-γ, and IL-2, or
chemotherapeutic agents including 5FU, vinblastine, actinomycin D, cisplatin, methotrexate,
and the like. In some instances the antibody may belong to a sub-type that activates serum
complement when complexed with the transmembrane protein, thereby mediating cytotoxicity
or antibody-dependent cytotoxicity (ADCC). Thus, lung cancer may be treated by
administering to a patient antibodies directed against the transmembrane lung cancer protein.
Antibody-labeling may activate a co-toxin, localize a toxin payload, or otherwise provide
means to locally ablate cells.

In another preferred embodiment, the antibody is conjugated to an effector moiety.
The effector moiety can be various molecules, including labeling moieties such as radioactive
labels or fluorescent labels, or can be a therapeutic moiety. In one aspect the therapeutic
moiety is a small molecule that modulates the activity of a lung cancer protein. In another
aspect the therapeutic moiety may modulate an activity of molecules associated with or in
close proximity to a lung cancer protein. The therapeutic moiety may inhibit enzymatic or
signaling activity such as protease or collagenase activity associated with lung cancer.

In a preferred embodiment, the therapeutic moiety can also be a cytotoxic agent. In
this method, targeting the cytotoxic agent to lung cancer tissue or cells results in a reduction
in the number of afflicted cells, thereby reducing symptoms associated with lung cancer.
Cytotoxic agents are numerous and varied and include, but are not limited to, cytotoxic drugs
or toxins or active fragments of such toxins. Suitable toxins and their corresponding
fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A chain, curcin,
crotin, phenomycin, enomycin, saporin, auristatin, and the like. Cytotoxic agents also include
radiochemicals made by conjugating radioisotopes to antibodies raised against lung cancer
proteins, or binding of a radionuclide to a chelating agent that has been covalently attached to
the antibody. Targeting the therapeutic moiety to transmembrane lung cancer proteins not
only serves to increase the local concentration of therapeutic moiety in the lung cancer
afflicted area, but also serves to reduce deleterious side effects that may be associated with
the untargeted therapeutic moiety.

In another preferred embodiment, the lung cancer protein against which the antibodies
are raised is an intracellular protein. In this case, the antibody may be conjugated to a protein
or other entity which facilitates entry into the cell. In one case, the antibody enters the cell by
endocytosis. In another embodiment, a nucleic acid encoding the antibody is administered to
the individual or cell. Moreover, wherein the lung cancer protein can be targeted within a
cell, e.g., the nucleus, an antibody thereto may contain a signal for that target localization,
e.g., a nuclear localization signal.

The lung cancer antibodies of the invention specifically bind to lung cancer proteins.
By "specifically bind" herein is meant that the antibodies bind to the protein with a Kd of at
least about 0.1 mM, more usually at least about 1 µM, preferably at least about 0.1 µM or
better, and most preferably, 0.01 µM or better. Selectivity of binding to the specific target
and not to related other sequences is also important.
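
For illustration only, the binding tiers described above can be expressed as a small Python
sketch. It assumes that the stated values are dissociation constants (Kd) in molar units with
thresholds of 0.1 mM, 1 µM, 0.1 µM, and 0.01 µM, and that "at least" and "or better" mean a
Kd at or below the stated value (tighter binding). The function name and tier labels are
hypothetical.

```python
def binding_tier(kd_molar: float) -> str:
    """Map a dissociation constant (in mol/L) onto the tiers stated in the text."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    if kd_molar <= 1e-8:       # 0.01 µM or better
        return "most preferred"
    if kd_molar <= 1e-7:       # 0.1 µM or better
        return "preferred"
    if kd_molar <= 1e-6:       # 1 µM or better
        return "more usual"
    if kd_molar <= 1e-4:       # 0.1 mM or better
        return "specific"
    return "not specific"      # weaker than the stated minimum
```

An antibody with a Kd of 5 nM would fall in the tightest tier under these assumptions,
whereas one with a Kd of 1 mM would not meet the stated minimum.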

Detection of lung cancer sequence for diagnostic and therapeutic applications 

In one aspect, the RNA expression levels of genes are determined for different
cellular states in the lung cancer phenotype. Expression levels of genes in normal tissue (e.g.,
not undergoing lung cancer), in lung cancer tissue (and in some cases, for varying severities
of lung cancer that relate to prognosis, as outlined below), or in non-malignant disease are
evaluated to provide expression profiles. A gene expression profile of a particular cell state
or point of development is essentially a "fingerprint" of the state of the cell. While two states
may have a particular gene similarly expressed, the evaluation of a number of genes
simultaneously allows the generation of a gene expression profile that is reflective of the state
of the cell. By comparing expression profiles of cells in different states, information
regarding which genes are important (including both up- and down-regulation of genes) in
each of these states is obtained. Then, diagnosis may be performed or confirmed to
determine whether a tissue sample has the gene expression profile of normal or cancerous
tissue. This will provide for molecular diagnosis of related conditions.
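
As a non-limiting illustration of comparing a sample "fingerprint" against reference
profiles, the following Python sketch assigns a sample to the reference profile with which it
correlates most strongly. The use of Pearson correlation, the function names, and the
representation of a profile as a list of expression values in a fixed gene order are all
assumptions made for this example; the text does not prescribe a particular comparison
method.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2 genes)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("a profile with no variation cannot be correlated")
    return cov / (sd_x * sd_y)


def closest_profile(sample, references):
    """Return the name of the reference profile best correlated with the sample."""
    return max(references, key=lambda name: pearson(sample, references[name]))
```

In practice the reference profiles would be derived from many characterized tissue samples,
and a single correlation score would be only one input to a diagnosis.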

"Differential expression," or grammatical equivalents as used herein, refers to
qualitative or quantitative differences in the temporal and/or cellular gene expression
patterns within and among cells and tissue. Thus, a differentially expressed gene can
qualitatively have its expression altered, including an activation or inactivation, in, e.g.,
normal versus lung cancer tissue. Genes may be turned on or turned off in a particular state,
relative to another state, thus permitting comparison of two or more states. A qualitatively
regulated gene will exhibit an expression pattern within a state or cell type which is
detectable by standard techniques. Some genes will be expressed in one state or cell type, but
not in both. Alternatively, the difference in expression may be quantitative, e.g., in that
expression is increased or decreased; i.e., gene expression is either upregulated, resulting in
an increased amount of transcript, or downregulated, resulting in a decreased amount of
transcript. The degree to which expression differs need only be large enough to quantify via
standard characterization techniques as outlined below, such as by use of Affymetrix
GeneChip™ expression arrays (Lockhart (1996) Nature Biotechnology 14:1675-1680, hereby
expressly incorporated by reference). Other techniques include, but are not limited to,
quantitative reverse transcriptase PCR, northern analysis and RNase protection. As outlined
above, preferably the change in expression (i.e., upregulation or downregulation) is
at least about 50%, more preferably at least about 100%, more preferably at least about
150%, more preferably at least about 200%, with from 300 to at least 1000% being especially
preferred.
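
By way of example only, the percentage thresholds above can be applied as in the following
Python sketch. The text does not define how the percentage is calculated; the sketch assumes
percent change relative to a reference (e.g., normal tissue) expression level, and the
function names and default threshold are hypothetical.

```python
def percent_change(reference: float, sample: float) -> float:
    """Percent change of sample relative to reference (positive = upregulated)."""
    if reference <= 0:
        raise ValueError("reference expression level must be positive")
    return (sample - reference) / reference * 100.0


def meets_threshold(reference: float, sample: float, threshold_pct: float = 50.0) -> bool:
    """True if the magnitude of the change is at least threshold_pct."""
    return abs(percent_change(reference, sample)) >= threshold_pct
```

Under this definition a decrease cannot exceed 100%, so thresholds above 100% can only be met
by upregulation; a fold-change definition would treat both directions symmetrically.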

Evaluation may be at the gene transcript or the protein level. The amount of gene
expression may be monitored using nucleic acid probes to the RNA or DNA equivalent of the
gene transcript, and the quantification of gene expression levels, or, alternatively, the final
gene product itself (protein) can be monitored, e.g., with antibodies to the lung cancer protein
and standard immunoassays (ELISAs, etc.) or other techniques, including mass spectroscopy
assays, 2D gel electrophoresis assays, etc. Proteins corresponding to lung cancer genes, e.g.,
those identified as being important in a lung cancer or disease phenotype, can be evaluated in
a lung cancer diagnostic test. In a preferred embodiment, gene expression monitoring is
performed simultaneously on a number of genes.

The lung cancer nucleic acid probes may be attached to biochips as outlined herein for
the detection and quantification of lung cancer sequences in a particular cell. The assays are
further described below in the example. PCR techniques can be used to provide greater
sensitivity. Multiple protein expression monitoring can be performed as well. Similarly,
these assays may be performed on an individual basis as well.

In a preferred embodiment nucleic acids encoding the lung cancer protein are
detected. Although DNA or RNA encoding the lung cancer protein may be detected, of
particular interest are methods wherein an mRNA encoding a lung cancer protein is detected.
Probes to detect mRNA can be a nucleotide/deoxynucleotide probe that is complementary to
and hybridizes with the mRNA and includes, but is not limited to, oligonucleotides, cDNA or
RNA. Probes also should contain a detectable label, as defined herein. In one method the
mRNA is detected after immobilizing the nucleic acid to be examined on a solid support such
as nylon membranes and hybridizing the probe with the sample. Following washing to
remove the non-specifically bound probe, the label is detected. In another method detection
of the mRNA is performed in situ. In this method permeabilized cells or tissue samples are
contacted with a detectably labeled nucleic acid probe for sufficient time to allow the probe
to hybridize with the target mRNA. Following washing to remove the non-specifically bound
probe, the label is detected. For example, a digoxygenin labeled riboprobe (RNA probe) that
is complementary to the mRNA encoding a lung cancer protein is detected by binding the
digoxygenin with an anti-digoxygenin secondary antibody and developed with nitro blue
tetrazolium and 5-bromo-4-chloro-3-indolyl phosphate.

In a preferred embodiment, various proteins from the three classes of proteins as
described herein (secreted, transmembrane or intracellular proteins) are used in diagnostic
assays. The lung cancer proteins, antibodies, nucleic acids, modified proteins and cells
containing lung cancer sequences are used in diagnostic assays. This can be performed on an
individual gene or corresponding polypeptide level. In a preferred embodiment, the
expression profiles are used, preferably in conjunction with high throughput screening
techniques to allow monitoring for expression profile genes and/or corresponding
polypeptides.

As described and defined herein, lung cancer proteins, including intracellular,
transmembrane, or secreted proteins, find use as markers of lung cancer, e.g., for prognostic
or diagnostic purposes. Detection of these proteins in putative lung cancer tissue allows for
detection, prognosis, or diagnosis of lung cancer or similar disease, and perhaps for selection
of therapeutic strategy. In one embodiment, antibodies are used to detect lung cancer
proteins. A preferred method separates proteins from a sample by electrophoresis on a gel
(typically a denaturing and reducing protein gel, but may be another type of gel, including
isoelectric focusing gels and the like). Following separation of proteins, the lung cancer
protein is detected, e.g., by immunoblotting with antibodies raised against the lung cancer
protein. Methods of immunoblotting are well known to those of ordinary skill in the art.

In another preferred method, antibodies to the lung cancer protein find use in in situ
imaging techniques, e.g., in histology (e.g., Asai (ed. 1993) Methods in Cell Biology:
Antibodies in Cell Biology, volume 37). In this method cells are contacted with from one to
many antibodies to the lung cancer protein(s). Following washing to remove non-specific
antibody binding, the presence of the antibody or antibodies is detected. In one embodiment
the antibody is detected by incubating with a secondary antibody that contains a detectable
label, e.g., multicolor fluorescence or confocal imaging. In another method the primary
antibody to the lung cancer protein(s) contains a detectable label, e.g., an enzyme marker that
can act on a substrate. In another preferred embodiment each one of multiple primary
antibodies contains a distinct and detectable label. This method finds particular use in
simultaneous screening for a plurality of lung cancer proteins. Many other histological
imaging techniques are also provided by the invention.

In a preferred embodiment the label is detected in a fluorometer which has the ability 
to detect and distinguish emissions of different wavelengths. In addition, a fluorescence 
activated cell sorter (FACS) can be used in the method. 

In another preferred embodiment, antibodies find use in diagnosing lung cancer from
blood, serum, plasma, stool, and other samples. Such samples, therefore, are useful as
samples to be probed or tested for the presence of lung cancer proteins. Antibodies can be
used to detect a lung cancer protein by previously described immunoassay techniques
including ELISA, immunoblotting (western blotting), immunoprecipitation, BIACORE
technology and the like. Conversely, the presence of antibodies may indicate an immune
response against an endogenous lung cancer protein or vaccine.

In a preferred embodiment, in situ hybridization of labeled lung cancer nucleic acid
probes to tissue arrays is done. For example, arrays of tissue samples, including lung cancer
tissue and/or normal tissue, are made. In situ hybridization (see, e.g., Ausubel, supra) is then
performed. When comparing the fingerprints between an individual and a standard, the
skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It is
further understood that the genes which indicate the diagnosis may differ from those which
indicate the prognosis, and molecular profiling of the condition of the cells may lead to
distinctions between responsive or refractory conditions or may be predictive of outcomes.

In a preferred embodiment, the lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing lung cancer sequences are used in prognosis assays.
As above, gene expression profiles can be generated that correlate to lung cancer, clinical,
pathological, or other information, in terms of long term prognosis. Again, this may be done
on either a protein or gene level, with the use of genes being preferred. Single or multiple
genes may be useful in various combinations. As above, lung cancer probes may be attached
to biochips for the detection and quantification of lung cancer sequences in a tissue or patient.
The assays proceed as outlined above for diagnosis. PCR methods may provide more
sensitive and accurate quantification.

Assays for therapeutic compounds 

In a preferred embodiment, the proteins, nucleic acids, and antibodies as described
herein are used in drug screening assays. The lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing lung cancer sequences are used in drug screening
assays or by evaluating the effect of drug candidates on a "gene expression profile" or
expression profile of polypeptides. In a preferred embodiment, the expression profiles are
used, preferably in conjunction with high throughput screening techniques to allow
monitoring for expression profile genes after treatment with a candidate agent (e.g.,
Zlokarnik, et al. (1998) Science 279:84-8; Heid (1996) Genome Res. 6:986-94).

In a preferred embodiment, the lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing the native or modified lung cancer proteins are used in
screening assays. That is, the present invention provides novel methods for screening for
compositions which modulate the lung cancer phenotype or an identified physiological
function of a lung cancer protein. As above, this can be done on an individual gene level or
by evaluating the effect of drug candidates on a "gene expression profile". In a preferred
embodiment, the expression profiles are used, preferably in conjunction with high throughput
screening techniques to allow monitoring for expression profile genes after treatment with a
candidate agent, see Zlokarnik, supra.

Having identified differentially expressed genes herein, a variety of assays may be
performed. In a preferred embodiment, assays may be run on an individual gene or protein
level. That is, having identified a particular gene with altered regulation in lung cancer, test
compounds can be screened for the ability to modulate gene expression or for binding to the
lung cancer protein. "Modulation" thus includes an increase or a decrease in gene
expression. The preferred amount of modulation will depend on the original change of the
gene expression in normal versus tissue undergoing lung cancer, with changes of at least
10%, preferably 50%, more preferably 100-300%, and in some embodiments 300-1000% or
greater. Thus, if a gene exhibits a 4-fold increase in lung cancer tissue compared to normal
tissue, a decrease of about four-fold is often desired; similarly, a 10-fold decrease in lung
cancer tissue compared to normal tissue often provides a target value of a 10-fold increase in
expression to be induced by the test compound.
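
The arithmetic of the preceding example can be sketched in Python as follows. The function
name is hypothetical, and the sketch assumes that the target is a return to the normal
expression level, so that the desired fold change is simply the reciprocal of the fold change
observed in lung cancer tissue.

```python
def target_fold_change(normal_level: float, cancer_level: float) -> float:
    """Fold change a test compound would need to induce to restore the normal level.

    Values below 1 indicate a desired decrease; values above 1 a desired increase.
    """
    if normal_level <= 0 or cancer_level <= 0:
        raise ValueError("expression levels must be positive")
    return normal_level / cancer_level
```

For a gene expressed four-fold higher in lung cancer tissue, the function returns 0.25, i.e.,
a four-fold decrease; for a gene expressed ten-fold lower, it returns 10.0.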

The amount of gene expression may be monitored using nucleic acid probes and the
quantification of gene expression levels, or, alternatively, the gene product itself can be
monitored, e.g., through the use of antibodies to the lung cancer protein and standard
immunoassays. Proteomics and separation techniques may also allow quantification of
expression.

In a preferred embodiment, gene or protein expression of a number of
entities, i.e., an expression profile, is monitored simultaneously. Such profiles will typically
involve a plurality of those entities described herein.

In this embodiment, the lung cancer nucleic acid probes are attached to biochips as
outlined herein for the detection and quantification of lung cancer sequences in a particular
cell. Alternatively, PCR may be used. Thus, a series, e.g., of microtiter plates, may be used
with dispensed primers in desired wells. A PCR reaction can then be performed and analyzed
for each well.
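
As one hypothetical way of tracking which primers are dispensed into which wells, the
following Python sketch assigns primer sets to the wells of a microtiter plate in row-major
order. The 8 x 12 (96-well) format, the well naming scheme, and the function name are
assumptions made for illustration.

```python
import string


def assign_primers_to_wells(primer_sets, rows=8, cols=12):
    """Map primer sets to wells (A1, A2, ..., H12) in row-major order."""
    if not 1 <= rows <= 26 or cols < 1:
        raise ValueError("unsupported plate dimensions")
    wells = [f"{row}{col}"
             for row in string.ascii_uppercase[:rows]
             for col in range(1, cols + 1)]
    if len(primer_sets) > len(wells):
        raise ValueError("more primer sets than wells on one plate")
    # Unused wells are simply left out of the mapping.
    return dict(zip(wells, primer_sets))
```

The resulting mapping can then be used to associate the result analyzed for each well with
the sequence that its primers were designed to amplify.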

Expression monitoring can be performed to identify compounds that modify the
expression of one or more lung cancer-associated sequences, e.g., a polynucleotide sequence
set out in the tables. Generally, in a preferred embodiment, a test compound is added to the
cells prior to analysis. Moreover, screens are also provided to identify agents that modulate
lung cancer, modulate lung cancer proteins, bind to a lung cancer protein, or interfere with
the binding of a lung cancer protein and an antibody, substrate, or other binding partner.

The term "test compound" or "drug candidate" or "modulator" or grammatical
equivalents as used herein describes a molecule, e.g., protein, oligopeptide, small organic
molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or
indirectly alter the lung cancer phenotype or the expression of a lung cancer sequence, e.g., a
nucleic acid or protein sequence. In preferred embodiments, modulators alter expression
profiles of nucleic acids or proteins provided herein. In one embodiment, the modulator
suppresses a lung cancer phenotype, e.g., to a normal or non-malignant tissue fingerprint. In
another embodiment, a modulator induces a lung cancer phenotype. Generally, a plurality of
assay mixtures are run in parallel with different agent concentrations to obtain a differential
response to the various concentrations. Typically, one of these concentrations serves as a
negative control, i.e., at zero concentration or below the level of detection.

In one aspect, a modulator will neutralize the effect of a lung cancer protein. By
"neutralize" is meant that activity of a protein and the consequent effect on the cell is
inhibited or blocked.
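As an illustration of the parallel assay mixtures run at different agent concentrations, the following minimal Python sketch subtracts the zero-concentration (negative control) signal from the signal at each agent concentration to give a differential response. The function name `differential_response` and the example readings are hypothetical and are not taken from this specification.

```python
def differential_response(readings):
    """Return control-subtracted responses keyed by agent concentration.

    `readings` maps agent concentration (arbitrary units) to a measured
    signal; the entry at concentration 0.0 is treated as the negative control.
    """
    if 0.0 not in readings:
        raise ValueError("a zero-concentration negative control is required")
    baseline = readings[0.0]
    return {
        conc: signal - baseline
        for conc, signal in sorted(readings.items())
        if conc != 0.0
    }


# Hypothetical signals for one agent tested at four concentrations.
example_readings = {0.0: 100.0, 1.0: 90.0, 10.0: 60.0, 100.0: 25.0}
responses = differential_response(example_readings)
```

A response that grows in magnitude with concentration, as in the hypothetical data, is the kind of differential response such a series is run to reveal.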

In certain embodiments, combinatorial libraries of potential modulators will be
screened for an ability to bind to a lung cancer polypeptide or to modulate activity.

Conventionally, new chemical entities with useful properties are generated by identifying a
chemical compound (called a "lead compound") with some desirable property or activity,
e.g., inhibiting activity, creating variants of the lead compound, and evaluating the property
and activity of those variant compounds. Often, high throughput screening (HTS) methods
are employed for such an analysis.

In one preferred embodiment, high throughput screening methods involve providing a
library containing a large number of potential therapeutic compounds (candidate
compounds). Such "combinatorial chemical libraries" are then screened in one or more
assays to identify those library members (particular chemical species or subclasses) that
display a desired characteristic activity. The compounds thus identified can serve as
conventional "lead compounds" or can themselves be used as potential or actual therapeutics.

A combinatorial chemical library is a collection of diverse chemical compounds
generated by either chemical synthesis or biological synthesis by combining a number of
chemical "building blocks" such as reagents. For example, a linear combinatorial chemical
library, such as a polypeptide (e.g., mutein) library, is formed by combining a set of chemical
building blocks called amino acids in every possible way for a given compound length (i.e.,
the number of amino acids in a polypeptide compound). Millions of chemical compounds
can be synthesized through such combinatorial mixing of chemical building blocks (Gallop,
et al. (1994) J. Med. Chem. 37(9):1233-1251).
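The growth of a linear combinatorial library with compound length can be sketched as follows. This is a minimal illustration only, assuming the 20 standard amino acids as building blocks; the names `library_size` and `enumerate_library` are hypothetical.

```python
from itertools import product

# One-letter codes for the 20 standard amino acids (assumed building blocks).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(length, building_blocks=AMINO_ACIDS):
    """Number of distinct linear sequences of the given length."""
    return len(building_blocks) ** length


def enumerate_library(length, building_blocks=AMINO_ACIDS):
    """Yield every linear combination of building blocks of the given length."""
    for combo in product(building_blocks, repeat=length):
        yield "".join(combo)


# Every possible dipeptide; longer lengths are counted rather than enumerated.
dipeptides = list(enumerate_library(2))
pentapeptide_count = library_size(5)
```

Under these assumptions a library of all five-residue peptides already contains 20 to the fifth power, i.e., 3,200,000, members, which is consistent with the statement that millions of compounds can be produced by combinatorial mixing.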

Preparation and screening of combinatorial chemical libraries is well known to those
of skill in the art. Such combinatorial chemical libraries include, but are not limited to,
peptide libraries (see, e.g., U.S. Patent No. 5,010,175, Furka (1991) Pept. Prot. Res.
37:487-493, Houghton, et al. (1991) Nature 354:84-88), peptoids (PCT Publication No. WO
91/19735), encoded peptides (PCT Publication WO 93/20242), random bio-oligomers (PCT
Publication WO 92/00091), benzodiazepines (U.S. Pat. No. 5,288,514), diversomers such as
hydantoins, benzodiazepines and dipeptides (Hobbs, et al. (1993) Proc. Nat. Acad. Sci. USA
90:6909-6913), vinylogous polypeptides (Hagihara, et al. (1992) J. Amer. Chem. Soc.
114:6568), nonpeptidal peptidomimetics with a Beta-D-Glucose scaffolding (Hirschmann, et
al. (1992) J. Amer. Chem. Soc. 114:9217-9218), analogous organic syntheses of small
compound libraries (Chen, et al. (1994) J. Amer. Chem. Soc. 116:2661), oligocarbamates
(Cho, et al. (1993) Science 261:1303), and/or peptidyl phosphonates (Campbell, et al. (1994)
J. Org. Chem. 59:658). See, generally, Gordon, et al. (1994) J. Med. Chem. 37:1385, nucleic
acid libraries (see, e.g., Stratagene, Corp.), peptide nucleic acid libraries (see, e.g., U.S.
Patent 5,539,083), antibody libraries (see, e.g., Vaughn, et al. (1996) Nature Biotechnology
14(3):309-314, and PCT/US96/10287), carbohydrate libraries (see, e.g., Liang, et al. (1996)
Science 274:1520-1522, and U.S. Patent No. 5,593,853), and small organic molecule libraries
(see, e.g., benzodiazepines, Baum (1993) C&EN, Jan 18, page 33; isoprenoids, U.S. Patent
No. 5,569,588; thiazolidinones and metathiazanones, U.S. Patent No. 5,549,974; pyrrolidines,
U.S. Patent Nos. 5,525,735 and 5,519,134; morpholino compounds, U.S. Patent No.
5,506,337; benzodiazepines, U.S. Patent No. 5,288,514; and the like).

Devices for the preparation of combinatorial libraries are commercially available (see,
e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville, KY; Symphony, Rainin,
Woburn, MA; 433A, Applied Biosystems, Foster City, CA; 9050 Plus, Millipore, Bedford,
MA).

A number of well known robotic systems have also been developed for solution phase
chemistries. These systems include automated workstations like the automated synthesis
apparatus developed by Takeda Chemical Industries, LTD. (Osaka, Japan) and many robotic
systems utilizing robotic arms (Zymate II, Zymark Corporation, Hopkinton, Mass.; Orca,
Hewlett-Packard, Palo Alto, Calif.), which mimic the manual synthetic operations performed
by a chemist. The above devices, with appropriate modification, are suitable for use with the
present invention. In addition, numerous combinatorial libraries are themselves
commercially available (see, e.g., ComGenex, Princeton, N.J., Asinex, Moscow, RU, Tripos,
Inc., St. Louis, MO, ChemStar, Ltd, Moscow, RU, 3D Pharmaceuticals, Exton, PA, Martek
Biosciences, Columbia, MD, etc.).

The assays to identify modulators are amenable to high throughput screening.
Preferred assays thus detect modulation of lung cancer gene transcription, polypeptide 
expression, and polypeptide activity. 

High throughput assays for evaluating the presence, absence, quantification, or other
properties of particular nucleic acids or protein products are well known to those of skill in
the art. Binding assays and reporter gene assays are similarly well known. Thus,
e.g., U.S. Patent No. 5,559,410 discloses high throughput screening methods for proteins,
U.S. Patent No. 5,585,639 discloses high throughput screening methods for nucleic acid
binding (i.e., in arrays), while U.S. Patent Nos. 5,576,220 and 5,541,061 disclose high
throughput methods of screening for ligand/antibody binding.

In addition, high throughput screening systems are commercially available (see, e.g.,
Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman
Instruments, Inc., Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems
typically automate procedures, including sample and reagent pipetting, liquid dispensing,
timed incubations, and final readings of the microplate in detector(s) appropriate for the
assay. These configurable systems provide high throughput and rapid start up as well as a
high degree of flexibility and customization. The manufacturers of such systems provide
detailed protocols for various high throughput systems. Thus, e.g., Zymark Corp. provides
technical bulletins describing screening systems for detecting the modulation of gene
transcription, ligand binding, and the like.

In one embodiment, modulators are proteins, often naturally occurring proteins or
fragments of naturally occurring proteins. Thus, e.g., cellular extracts containing proteins, or
random or directed digests of proteinaceous cellular extracts, may be used. In this way
libraries of proteins may be made for screening in the methods of the invention. Particularly
preferred in this embodiment are libraries of bacterial, fungal, viral, and mammalian proteins,
with the latter being preferred, and human proteins being especially preferred. Particularly
useful test compounds will be directed to the class of proteins to which the target belongs, e.g.,
substrates for enzymes or ligands and receptors.

In a preferred embodiment, modulators are peptides of from about 5 to about 30
amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7 to
about 15 being particularly preferred. The peptides may be digests of naturally occurring
proteins, random peptides, or "biased" random peptides. By "randomized" or grammatical
equivalents herein is meant that the nucleic acid or peptide consists of essentially random
sequences of nucleotides and amino acids, respectively. Since these random peptides (or
nucleic acids, discussed below) are often chemically synthesized, they may incorporate a
nucleotide or amino acid at any position. The synthetic process can be designed to generate
randomized proteins or nucleic acids, to allow the formation of all or most of the possible
combinations over the length of the sequence, thus forming a library of randomized candidate
bioactive proteinaceous agents.

In one embodiment, the library is fully randomized, with no sequence preferences or
constants at any position. In a preferred embodiment, the library is biased. That is, some
positions within the sequence are either held constant, or are selected from a limited number
of possibilities. In a preferred embodiment, the nucleotides or amino acid residues are
randomized within a defined class, e.g., of hydrophobic amino acids, hydrophilic residues,
sterically biased (either small or large) residues, towards the creation of nucleic acid binding
domains, the creation of cysteines, for cross-linking, prolines for SH-3 domains, serines,
threonines, tyrosines or histidines for phosphorylation sites, etc.
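The difference between a fully randomized and a biased library can be sketched in Python as a per-position template, in which a position is held constant, restricted to a defined residue class, or left fully random. The template, the residue grouping in `HYDROPHOBIC`, and the function name `random_peptide` are illustrative assumptions only.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids
HYDROPHOBIC = "AVLIMFW"  # illustrative residue class; other groupings are possible


def random_peptide(template, rng):
    """Build one peptide from a per-position template.

    Each template entry is a string of allowed residues: a single letter
    holds the position constant, a longer string restricts it to a class,
    and AMINO_ACIDS leaves it fully randomized.
    """
    return "".join(rng.choice(allowed) for allowed in template)


# Hypothetical 7-mer: fixed cysteines at both ends (e.g., for cross-linking),
# hydrophobic residues at positions 2 and 3, fully random elsewhere.
template = ["C", HYDROPHOBIC, HYDROPHOBIC, AMINO_ACIDS, AMINO_ACIDS, AMINO_ACIDS, "C"]
rng = random.Random(0)  # seeded only so the sketch is reproducible
library = [random_peptide(template, rng) for _ in range(10)]
```

Setting every template entry to `AMINO_ACIDS` gives the fully randomized case; narrowing individual entries gives the biased case.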

Modulators of lung cancer can also be nucleic acids, as defined above. 

As described above generally for proteins, nucleic acid modulating agents may be
naturally occurring nucleic acids, random nucleic acids, or "biased" random nucleic acids.
Digests of prokaryotic or eukaryotic genomes may be used as is outlined above for proteins.

In a preferred embodiment, the candidate compounds are organic chemical moieties, a 
wide variety of which are available in the literature. 

After a candidate agent has been added and the cells allowed to incubate for some
period of time, the sample containing a target sequence is analyzed. If required, the target
sequence is prepared using known techniques. For example, the sample may be treated to
lyse the cells, using known lysis buffers, electroporation, etc., with purification and/or
amplification such as PCR performed as appropriate. For example, an in vitro transcription
with labels covalently attached to the nucleotides is performed. Generally, the nucleic acids
are labeled with biotin-FITC or PE, or with cy3 or cy5.

In a preferred embodiment, the target sequence is labeled with, e.g., a fluorescent, a
chemiluminescent, a chemical, or a radioactive signal, to provide a means of detecting the
target sequence's specific binding to a probe. The label also can be an enzyme, such as
alkaline phosphatase or horseradish peroxidase, which when provided with an appropriate
substrate produces a product that can be detected. Alternatively, the label can be a labeled
compound or small molecule, such as an enzyme inhibitor, that binds but is not catalyzed or
altered by the enzyme. The label also can be a moiety or compound, such as an epitope tag
or biotin which specifically binds to streptavidin. For the example of biotin, the streptavidin
is labeled as described above, thereby providing a detectable signal for the bound target
sequence. Unbound labeled streptavidin is typically removed prior to analysis.

Nucleic acid assays can be direct hybridization assays or can comprise "sandwich
assays", which include the use of multiple probes, as is generally outlined in U.S. Patent Nos.
5,681,702, 5,597,909, 5,545,730, 5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,571,670,
5,591,584, 5,624,802, 5,635,352, 5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of
which are hereby incorporated by reference. In this embodiment, in general, the target nucleic
acid is prepared as outlined above, and then added to the biochip comprising a plurality of
nucleic acid probes, under conditions that allow the formation of a hybridization complex.

A variety of hybridization conditions may be used in the present invention, including
high, moderate and low stringency conditions as outlined above. The assays are generally
run under stringency conditions which allow formation of the label probe hybridization
complex only in the presence of target. Stringency can be controlled by altering a step
parameter that is a thermodynamic variable, including, but not limited to, temperature,
formamide concentration, salt concentration, chaotropic salt concentration, pH, organic
solvent concentration, etc.

These parameters may also be used to control non-specific binding, as is generally
outlined in U.S. Patent No. 5,681,697. Thus it may be desirable to perform certain steps at
higher stringency conditions to reduce non-specific binding.

The reactions outlined herein may be accomplished in a variety of ways. Components
of the reaction may be added simultaneously, or sequentially, in different orders, with
preferred embodiments outlined below. In addition, the reaction may include a variety of
other reagents. These include salts, buffers, neutral proteins, e.g., albumin, detergents, etc.
which may be used to facilitate optimal hybridization and detection, and/or reduce
non-specific or background interactions. Reagents that otherwise improve the efficiency of the
assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may also be
used as appropriate, depending on the sample preparation methods and purity of the target.

The assay data are analyzed to determine the expression levels, and changes in
expression levels as between states, of individual genes, forming a gene expression profile.
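One common way to express changes in expression levels as between two states is a per-gene log2 ratio. The following Python sketch is illustrative only; the pseudocount, the gene names, and the intensity values are assumptions and are not taken from this specification.

```python
import math


def expression_profile(state_a, state_b, pseudocount=1.0):
    """Return log2(state_b / state_a) for each gene present in both states.

    A pseudocount is added to both values so that zero signals do not
    cause a division by zero; its value here is an arbitrary assumption.
    """
    shared = sorted(set(state_a) & set(state_b))
    return {
        gene: math.log2((state_b[gene] + pseudocount) / (state_a[gene] + pseudocount))
        for gene in shared
    }


# Hypothetical hybridization intensities (arbitrary units).
normal = {"geneA": 99.0, "geneB": 399.0, "geneC": 49.0}
tumor = {"geneA": 399.0, "geneB": 99.0, "geneC": 49.0}
profile = expression_profile(normal, tumor)
```

With these hypothetical values, a positive entry marks a gene whose signal is higher in the second state, a negative entry a gene whose signal is lower, and an entry near zero a gene that is unchanged.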

Screens are performed to identify modulators of the lung cancer phenotype. In one
embodiment, screening is performed to identify modulators that can induce or suppress a
particular expression profile, thus preferably generating the associated phenotype. In another
embodiment, e.g., for diagnostic applications, having identified differentially expressed genes
important in a particular state, screens can be performed to identify modulators that alter
expression of individual genes. In another embodiment, screening is performed to identify
modulators that alter a biological function of the expression product of a differentially
expressed gene. Again, having identified the importance of a gene in a particular state,
screens are performed to identify agents that bind and/or modulate the biological activity of
the gene product, or evaluate genetic polymorphisms.

Genes can be screened for those that are induced in response to a candidate agent.
After identifying a modulator based upon its ability to suppress a lung cancer expression
pattern leading to a normal expression pattern, or to modulate a single lung cancer gene
expression profile so as to mimic the expression of the gene from normal tissue, a screen as
described above can be performed to identify genes that are specifically modulated in
response to the agent. Comparing expression profiles between normal tissue and agent
treated lung cancer tissue reveals genes that are not expressed in normal tissue or lung cancer
tissue, but are expressed in agent treated tissue. These agent-specific sequences can be
identified and used by methods described herein for lung cancer genes or proteins. In
particular these sequences and the proteins they encode find use in marking or identifying
agent treated cells. In addition, antibodies can be raised against the agent induced proteins
and used to target novel therapeutics to the treated lung cancer tissue sample.

Thus, in one embodiment, a test compound is administered to a population of lung
cancer cells, that have an associated lung cancer expression profile. By "administration" or
"contacting" herein is meant that the candidate agent is added to the cells in such a manner as
to allow the agent to act upon the cell, whether by uptake and intracellular action, or by
action at the cell surface. In some embodiments, nucleic acid encoding a proteinaceous
candidate agent (i.e., a peptide) may be put into a viral construct such as an adenoviral or
retroviral construct, and added to the cell, such that expression of the peptide agent is
accomplished, e.g., PCT US97/01019. Regulatable gene therapy systems can also be used.

Once a test compound has been administered to the cells, the cells can be washed if
desired and are allowed to incubate under preferably physiological conditions for some
period of time. The cells are then harvested and a new gene expression profile is generated,
as outlined herein.

Thus, e.g., lung cancer or non-malignant tissue may be screened for agents that
modulate, e.g., induce or suppress a lung cancer phenotype. A change in at least one gene,
preferably many, of the expression profile indicates that the agent has an effect on lung
cancer activity. By defining such a signature for the lung cancer phenotype, screens for new
drugs that alter the phenotype can be devised. With this approach, the drug target need not be
known and need not be represented in the original expression screening platform, nor does
the level of transcript for the target protein need to change.

Measurement of lung cancer polypeptide activity, or of lung cancer or the lung cancer
phenotype can be performed using a variety of assays. For example, the effects of the test
compounds upon the function of the metastatic polypeptides can be measured by examining
parameters described above. A suitable physiological change that affects activity can be used
to assess the influence of a test compound on the polypeptides of this invention. When the
functional consequences are determined using intact cells or animals, one can also measure a
variety of effects such as, in the case of lung cancer associated with tumors, tumor growth,
tumor metastasis, neovascularization, hormone release, transcriptional changes to both known
and uncharacterized genetic markers (e.g., northern blots), changes in cell metabolism such as
cell growth or pH changes, and changes in intracellular second messengers such as cGMP. In
the assays of the invention, mammalian lung cancer polypeptide is typically used, e.g.,
mouse, preferably human.

Assays to identify compounds with modulating activity can be performed in vitro.
For example, a lung cancer polypeptide is first contacted with a potential modulator and
incubated for a suitable amount of time, e.g., from 0.5 to 48 hours. In one embodiment, the
lung cancer polypeptide levels are determined in vitro by measuring the level of protein or
mRNA. The level of protein is typically measured using immunoassays such as western
blotting, ELISA and the like with an antibody that selectively binds to the lung cancer
polypeptide or a fragment thereof. For measurement of mRNA, amplification, e.g., using
PCR, LCR, or hybridization assays, e.g., northern hybridization, RNAse protection, dot
blotting, are preferred. The level of protein or mRNA is typically detected using directly or
indirectly labeled detection agents, e.g., fluorescently or radioactively labeled nucleic acids,
radioactively or enzymatically labeled antibodies, and the like, as described herein.

Alternatively, a reporter gene system can be devised using a lung cancer protein
promoter operably linked to a reporter gene such as luciferase, green fluorescent protein,
CAT, or β-gal. The reporter construct is typically transfected into a cell. After treatment
with a potential modulator, the amount of reporter gene transcription, translation, or activity
is measured according to standard techniques known to those of skill in the art.

In a preferred embodiment, as outlined above, screens may be done on individual
genes and gene products (proteins). That is, having identified a particular differentially
expressed gene as important in a particular state, screening of modulators of the expression of
the gene or the gene product itself can be done. The gene products of differentially expressed
genes are sometimes referred to herein as "lung cancer proteins." The lung cancer protein
may be a fragment, or alternatively, be the full length protein to a fragment shown herein.

In one embodiment, screening for modulators of expression of specific genes is
performed. Typically, the expression of only one or a few genes is evaluated. In another
embodiment, screens are designed to first find compounds that bind to differentially
expressed proteins. These compounds are then evaluated for the ability to modulate
differentially expressed activity. Moreover, once initial candidate compounds are identified,
variants can be further screened to better evaluate structure activity relationships.

In a preferred embodiment, binding assays are done. In general, purified or isolated
gene product is used; that is, the gene products of one or more differentially expressed
nucleic acids are made. For example, antibodies are generated to the protein gene products,
and standard immunoassays are run to determine the amount of protein present. Alternatively,
cells comprising the lung cancer proteins can be used in the assays.

Thus, in a preferred embodiment, the methods comprise combining a lung cancer
protein and a candidate compound, and determining the binding of the compound to the lung
cancer protein. Preferred embodiments utilize the human lung cancer protein, although other
mammalian proteins may also be used, e.g., for the development of animal models of human
disease. In some embodiments, as outlined herein, variant or derivative lung cancer proteins
may be used.

Generally, in a preferred embodiment of the methods herein, the lung cancer protein
or the candidate agent is non-diffusably bound to an insoluble support, preferably having
isolated sample receiving areas (e.g., a microtiter plate, an array, etc.). The insoluble
supports may be made of any composition to which the compositions can be bound, that is
readily separated from soluble material, and that is otherwise compatible with the overall
method of screening. The surface of such supports may be solid or porous and of a convenient
shape. Examples of suitable insoluble supports include microtiter plates, arrays, membranes and
beads. These are typically made of glass, plastic (e.g., polystyrene), polysaccharides, nylon
or nitrocellulose, Teflon™, etc. Microtiter plates and arrays are especially convenient because
a large number of assays can be carried out simultaneously, using small amounts of reagents
and samples. The particular manner of binding of the composition is typically not crucial so
long as it is compatible with the reagents and overall methods of the invention, maintains the
activity of the composition, and is nondiffusable. Preferred methods of binding include the
use of antibodies (which do not sterically block either the ligand binding site or activation
sequence when the protein is bound to the support), direct binding to "sticky" or ionic
supports, chemical crosslinking, the synthesis of the protein or agent on the surface, etc.
Following binding of the protein or agent, excess unbound material is removed by washing.
The sample receiving areas may then be blocked through incubation with bovine serum
albumin (BSA), casein or other innocuous protein or other moiety.

In a preferred embodiment, the lung cancer protein is bound to the support, and a test
compound is added to the assay. Alternatively, the candidate agent is bound to the support
and the lung cancer protein is added. Novel binding agents include specific antibodies,
non-natural binding agents identified in screens of chemical libraries, peptide analogs, etc. Of
particular interest are screening assays for agents that have a low toxicity for human cells. A
wide variety of assays may be used for this purpose, including labeled in vitro protein-protein
binding assays, electrophoretic mobility shift assays, immunoassays for protein binding,
functional assays (phosphorylation assays, etc.) and the like.

The determination of the binding of the test modulating compound to the lung cancer
protein may be done in a number of ways. In a preferred embodiment, the compound is
labeled, and binding determined directly, e.g., by attaching all or a portion of the lung cancer
protein to a solid support, adding a labeled candidate agent (e.g., a fluorescent label), washing
off excess reagent, and determining whether the label is present on the solid support. Various
blocking and washing steps may be utilized as appropriate.

In some embodiments, only one of the components is labeled, e.g., the proteins (or
proteinaceous candidate compounds) can be labeled. Alternatively, more than one
component can be labeled with different labels, e.g., ¹²⁵I for the proteins and a fluorophor for
the compound. Proximity reagents, e.g., quenching or energy transfer reagents are also
useful.

In one embodiment, the binding of the test compound is determined by competitive
binding assay. The competitor may be a binding moiety known to bind to the target molecule
(i.e., a lung cancer protein), such as an antibody, peptide, binding partner, ligand, etc. Under
certain circumstances, there may be competitive binding between the compound and the
binding moiety, with the binding moiety displacing the compound. In one embodiment, the
test compound is labeled. Either the compound, or the competitor, or both, is added first to
the protein for a time sufficient to allow binding, if present. Incubations may be performed at
a temperature which facilitates optimal activity, typically between 4 and 40°C. Incubation
periods are typically optimized, e.g., to facilitate rapid high throughput screening. Typically
between 0.1 and 1 hour will be sufficient. Excess reagent is generally removed or washed
away. The second component is then added, and the presence or absence of the labeled
component is followed, to indicate binding.

In a preferred embodiment, the competitor is added first, followed by a test
compound. Displacement of the competitor is an indication that the test compound is binding
to the lung cancer protein and thus is capable of binding to, and potentially modulating, the
activity of the lung cancer protein. In this embodiment, either component can be labeled.
Thus, e.g., if the competitor is labeled, the presence of label in the wash solution indicates
displacement by the agent. Alternatively, if the test compound is labeled, the presence of the
label on the support indicates displacement.
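Where the competitor carries the label, displacement can be expressed as the percentage of support-bound signal lost after the test compound is added. The sketch below is a hypothetical illustration; the function name `percent_displacement` and the signal values are assumptions and do not describe any particular assay of this specification.

```python
def percent_displacement(bound_before, bound_after):
    """Percent of labeled competitor displaced from the support.

    `bound_before` is the label signal on the support before the test
    compound is added; `bound_after` is the signal remaining after
    incubation with the test compound and washing.
    """
    if bound_before <= 0:
        raise ValueError("bound_before must be positive")
    return 100.0 * (bound_before - bound_after) / bound_before


# Hypothetical label signals on the support (arbitrary units).
displaced = percent_displacement(bound_before=2000.0, bound_after=500.0)
```

A value near zero would suggest that the test compound does not compete for the same site under the conditions used, whereas a high value is consistent with displacement.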

In an alternative embodiment, the test compound is added first, with incubation and
washing, followed by the competitor. The absence of binding by the competitor may indicate
that the test compound is bound to the lung cancer protein with a higher affinity. Thus, if the
test compound is labeled, the presence of the label on the support, coupled with a lack of
competitor binding, may indicate that the test compound is capable of binding to the lung
cancer protein.

In a preferred embodiment, the methods comprise differential screening to identify
agents that are capable of modulating the activity of the lung cancer proteins. In one
embodiment, the methods comprise combining a lung cancer protein and a competitor in a
first sample. A second sample comprises a test compound, a lung cancer protein, and a
competitor. The binding of the competitor is determined for both samples, and a change, or
difference in binding between the two samples indicates the presence of an agent capable of
binding to the lung cancer protein and potentially modulating its activity. That is, if the
binding of the competitor is different in the second sample relative to the first sample, the
agent is capable of binding to the lung cancer protein.

Alternatively, differential screening is used to identify drug candidates that bind to the
native lung cancer protein, but cannot bind to modified lung cancer proteins. The structure of
the lung cancer protein may be modeled, and used in rational drug design to synthesize agents
that interact with that site. Drug candidates that affect the activity of a lung cancer protein
are also identified by screening drugs for the ability to either enhance or reduce the activity of
the protein.

Positive controls and negative controls may be used in the assays. Preferably control
and test samples are performed in at least triplicate to obtain statistically significant results.
Incubation of all samples is for a time sufficient for the binding of the agent to the protein.
Following incubation, samples are washed free of non-specifically bound material and the
amount of bound, generally labeled agent determined. For example, where a radiolabel is
employed, the samples may be counted in a scintillation counter to determine the amount of
bound compound.
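As a sketch of how triplicate control and test readings might be compared, the following Python example computes Welch's t statistic from two sets of replicate counts. The choice of Welch's test and the example counts are assumptions made for illustration; this specification does not prescribe a particular statistical test, and the statistic must still be compared against an appropriate critical value to judge significance.

```python
from math import sqrt
from statistics import mean, stdev


def welch_t(control, test):
    """Welch's t statistic for two independent sets of replicate readings."""
    if len(control) < 2 or len(test) < 2:
        raise ValueError("each group needs at least two replicates")
    # Squared standard error of each group mean (sample variance / n).
    var_c = stdev(control) ** 2 / len(control)
    var_t = stdev(test) ** 2 / len(test)
    return (mean(test) - mean(control)) / sqrt(var_c + var_t)


# Hypothetical scintillation counts (counts per minute), in triplicate.
control_cpm = [1000.0, 1100.0, 900.0]
test_cpm = [400.0, 500.0, 450.0]
t_stat = welch_t(control_cpm, test_cpm)
```

A large negative statistic, as produced by these hypothetical counts, reflects a lower mean bound signal in the test samples than in the controls.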

A variety of other reagents may be included in the screening assays. These include
reagents like salts, neutral proteins, e.g., albumin, detergents, etc. which may be used to
facilitate optimal protein-protein binding and/or reduce non-specific or background
interactions. Also reagents that otherwise improve the efficiency of the assay, such as
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture
of components may be added in an order that provides for the requisite binding.

In a preferred embodiment, the invention provides methods for screening for a
compound capable of modulating the activity of a lung cancer protein. The methods
comprise adding a test compound, as defined above, to a cell comprising lung cancer
proteins. Preferred cell types include almost any cell. The cells contain a recombinant
nucleic acid that encodes a lung cancer protein. In a preferred embodiment, a library of
candidate agents is tested on a plurality of cells.

In one aspect, the assays are evaluated in the presence or absence of, or with previous or
subsequent exposure to, physiological signals, e.g., hormones, antibodies, peptides, antigens,
cytokines, growth factors, action potentials, pharmacological agents including
chemotherapeutics, radiation, carcinogenics, or other cells (e.g., cell-cell contacts). In another
example, the determinations are made at different stages of the cell cycle process.

In this way, compounds that modulate lung cancer agents are identified. Compounds
with pharmacological activity are able to enhance or interfere with the activity of the lung
cancer protein. Once identified, similar structures are evaluated to identify critical structural
features of the compound.

In one embodiment, a method of inhibiting lung cancer cell division is provided. The
method comprises administration of a lung cancer inhibitor. In another embodiment, a
method of inhibiting lung cancer is provided. The method may comprise administration of a
lung cancer inhibitor. In a further embodiment, methods of treating cells or individuals with
lung cancer are provided, e.g., comprising administration of a lung cancer inhibitor.

In one embodiment, a lung cancer inhibitor is an antibody as discussed above. In
another embodiment, the lung cancer inhibitor is an antisense molecule.

A variety of cell growth, proliferation, viability, and metastasis assays are known to
those of skill in the art, as described below.



Soft agar growth or colony formation in suspension 
Normal cells require a solid substrate to attach and grow. When the cells are
transformed, they lose this phenotype and grow detached from the substrate. For example,
transformed cells can grow in stirred suspension culture or suspended in semi-solid media,
such as semi-solid or soft agar. The transformed cells, when transfected with tumor
suppressor genes, regenerate a normal phenotype and require a solid substrate to attach and
grow. Soft agar growth or colony formation in suspension assays can be used to identify
modulators of lung cancer sequences, which when expressed in host cells, inhibit abnormal
cellular proliferation and transformation. A therapeutic compound would reduce or eliminate
the host cells' ability to grow in stirred suspension culture or suspended in semi-solid media,
such as semi-solid or soft agar.

Techniques for soft agar growth or colony formation in suspension assays are
described in Freshney (1994) Culture of Animal Cells: A Manual of Basic Technique (3rd ed.),
herein incorporated by reference. See also, the methods section of Garkavtsev, et al. (1996), 
supra, herein incorporated by reference. 



Contact inhibition and density limitation of growth

Normal cells typically grow in a flat and organized pattern in a petri dish until they
touch other cells. When the cells touch one another, they are contact inhibited and stop
growing. When cells are transformed, however, the cells are not contact inhibited and
continue to grow to high densities in disorganized foci. Thus, the transformed cells grow to a
higher saturation density than normal cells. This can be detected morphologically by the
formation of a disoriented monolayer of cells or rounded cells in foci within the regular
pattern of normal surrounding cells. Alternatively, labeling index with (³H)-thymidine at
saturation density can be used to measure density limitation of growth. See Freshney (1994),
supra. The transformed cells, when transfected with tumor suppressor genes, regenerate a
normal phenotype and become contact inhibited and would grow to a lower density.

In this assay, labeling index with (³H)-thymidine at saturation density is a preferred
method of measuring density limitation of growth. Transformed host cells are transfected
with a lung cancer-associated sequence and are grown for 24 hours at saturation density in




wo 02/086443 PCT/US02/12476 
non-limiting medium conditions. The percentage of cells labeling with (³H)-thymidine is
determined autoradiographically. See, Freshney (1994), supra.
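
The arithmetic behind the labeling index can be sketched as follows. This is a minimal illustration only, assuming that cells on an autoradiograph are scored simply as labeled or unlabeled; the function name and the cell counts are hypothetical and are not taken from the specification.

```python
def labeling_index(labeled_cells: int, total_cells: int) -> float:
    """Return the percentage of scored cells that carry label."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= labeled_cells <= total_cells:
        raise ValueError("labeled_cells must lie between 0 and total_cells")
    return 100.0 * labeled_cells / total_cells


# Hypothetical counts scored at saturation density.
control_index = labeling_index(180, 400)      # transformed host cells, control
transfected_index = labeling_index(60, 400)   # cells transfected with the test sequence
print(f"control: {control_index:.1f}%  transfected: {transfected_index:.1f}%")
```

A lower index in the transfected population than in the control would be consistent with restored density limitation of growth.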



Growth factor or serum dependence 
Transformed cells typically have a lower serum dependence than their normal
counterparts (see, e.g., Temin (1966) J. Natl. Cancer Inst. 37:167-175; Eagle, et al. (1970) J.
Exp. Med. 131:836-879; Freshney, supra). This is in part due to release of various growth
factors by the transformed cells. Growth factor or serum dependence of transformed host 
cells can be compared with that of control. 


Tumor specific marker levels

Tumor cells release a greater amount of certain factors (hereinafter "tumor
specific markers") than their normal counterparts. For example, plasminogen activator (PA)
is released from human glioma at a higher level than from normal brain cells (see, e.g.,
Gullino, "Angiogenesis, tumor vascularization, and potential interference with tumor growth"
in Mihich (ed. 1985) Biological Responses in Cancer, pp. 178-184). Similarly, tumor
angiogenesis factor (TAF) is released at a higher level in tumor cells than their normal
counterparts. See, e.g., Folkman (1992) "Angiogenesis and Cancer" in Sem. Cancer Biol.
Various techniques which measure the release of these factors are described in
Freshney (1994), supra. Also, see, Unkeless, et al. (1974) J. Biol. Chem. 249:4295-4305;
Strickland and Beers (1976) J. Biol. Chem. 251:5694-5702; Whur, et al. (1980) Br. J. Cancer
42:305-312; Gullino, "Angiogenesis, tumor vascularization, and potential interference with
tumor growth" in Mihich (ed. 1985) Biological Responses in Cancer, pp. 178-184; Freshney
(1985) Anticancer Res. 5:111-130.


Invasiveness into Matrigel 

The degree of invasiveness into Matrigel or some other extracellular matrix
constituent can be used as an assay to identify compounds that modulate lung cancer-associated
sequences. Tumor cells exhibit a good correlation between malignancy and
invasiveness of cells into Matrigel or some other extracellular matrix constituent. In this
assay, tumorigenic cells are typically used as host cells. Expression of a tumor suppressor
gene in these host cells would decrease invasiveness of the host cells.

Techniques described in Freshney (1994), supra, can be used. Briefly, the level of
invasion of host cells can be measured by using filters coated with Matrigel or some other
extracellular matrix constituent. Penetration into the gel, or through to the distal side of the
filter, is rated as invasiveness, and rated histologically by number of cells and distance
moved, or by prelabeling the cells with ¹²⁵I and counting the radioactivity on the distal side of
the filter or bottom of the dish. See, e.g., Freshney (1984), supra.



Tumor growth in vivo 

Effects of lung cancer-associated sequences on cell growth can be tested in transgenic
or immune-suppressed mice. Knock-out transgenic mice can be made, in which the lung
cancer gene is disrupted or in which a lung cancer gene is inserted. Knock-out transgenic
mice can be made by insertion of a marker gene or other heterologous gene into the
endogenous lung cancer gene site in the mouse genome via homologous recombination.
Such mice can also be made by substituting the endogenous lung cancer gene with a mutated
version of the lung cancer gene, or by mutating the endogenous lung cancer gene, e.g., by
exposure to carcinogens.

A DNA construct is introduced into the nuclei of embryonic stem cells. Cells
containing the newly engineered genetic lesion are injected into a host mouse embryo, which
is re-implanted into a recipient female. Some of these embryos develop into chimeric mice
that possess germ cells partially derived from the mutant cell line. Therefore, by breeding the
chimeric mice it is possible to obtain a new line of mice containing the introduced genetic
lesion (see, e.g., Capecchi, et al. (1989) Science 244:1288). Chimeric targeted mice can be
derived according to Hogan, et al. (1988) Manipulating the Mouse Embryo: A Laboratory
Manual, Cold Spring Harbor Laboratory and Robertson (ed. 1987) Teratocarcinomas and
Embryonic Stem Cells: A Practical Approach, IRL Press, Washington, D.C.

Alternatively, various immune-suppressed or immune-deficient host animals can be
used. For example, a genetically athymic "nude" mouse (see, e.g., Giovanella, et al. (1974) J.
Natl. Cancer Inst. 52:921), a SCID mouse, a thymectomized mouse, or an irradiated mouse
(see, e.g., Bradley, et al. (1978) Br. J. Cancer 38:263; Selby, et al. (1980) Br. J. Cancer 41:52)
can be used as a host. Transplantable tumor cells (typically about 10^6 cells) injected into
isogenic hosts will produce invasive tumors in a high proportion of cases, while normal cells
of similar origin will not. In hosts which developed invasive tumors, cells expressing a lung
cancer-associated sequence are injected subcutaneously. After a suitable length of time,
preferably 4-8 weeks, tumor growth is measured (e.g., by volume or by its two largest
dimensions) and compared to the control. Tumors that have statistically significant reduction
(using, e.g., Student's t test) are said to have inhibited growth.
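
The comparison of tumor growth against control described above can be sketched as follows. This is an illustrative sketch only: the volume estimate from the two largest dimensions (length x width^2 / 2) is a commonly used approximation that is assumed here and is not stated in the specification, the measurements are hypothetical, and the pooled-variance form of Student's t test is one of several possible choices.

```python
import math
from statistics import mean, variance


def tumor_volume(length_mm: float, width_mm: float) -> float:
    """Estimate volume from the two largest dimensions (assumed L * W^2 / 2)."""
    return length_mm * width_mm ** 2 / 2.0


def student_t(sample_a, sample_b):
    """Return the pooled-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two measurements")
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    if pooled == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical (length, width) measurements in mm taken at the end of the study.
control = [tumor_volume(l, w) for l, w in [(12, 10), (13, 9), (11, 10), (14, 11)]]
treated = [tumor_volume(l, w) for l, w in [(8, 6), (9, 7), (7, 6), (8, 7)]]

t_stat, dof = student_t(treated, control)
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

The resulting statistic would then be compared with the critical value of the t distribution for the stated degrees of freedom at the chosen significance level.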



Polynucleotide modulators of lung cancer

Antisense and RNAi Polynucleotides

In certain embodiments, the activity of a lung cancer-associated protein is
downregulated, or entirely inhibited, by the use of antisense or an inhibitory polynucleotide,
i.e., a nucleic acid complementary to, and which can preferably hybridize specifically to, a
coding mRNA nucleic acid sequence, e.g., a lung cancer protein mRNA, or a subsequence
thereof. Binding of the antisense polynucleotide to the mRNA reduces the translation and/or
stability of the mRNA.

In the context of this invention, antisense polynucleotides can comprise naturally-occurring
nucleotides, or synthetic species formed from naturally-occurring subunits or their
close homologs. Antisense polynucleotides may also have altered sugar moieties or inter-sugar
linkages. Exemplary among these are the phosphorothioate and other sulfur containing
species which are known for use in the art. Analogs are comprehended by this invention so
long as they function effectively to hybridize with the lung cancer protein mRNA. See, e.g.,
Isis Pharmaceuticals, Carlsbad, CA; Sequitur, Inc., Natick, MA.

Such antisense polynucleotides can readily be synthesized using recombinant means,
or can be synthesized in vitro. Equipment for such synthesis is sold by several vendors,
including Applied Biosystems. The preparation of other oligonucleotides such as
phosphorothioates and alkylated derivatives is also well known to those of skill in the art.
Antisense molecules as used herein include antisense or sense oligonucleotides.
Sense oligonucleotides can, e.g., be employed to block transcription by binding to the antisense
strand. The antisense and sense oligonucleotides comprise a single-stranded nucleic
acid sequence (either RNA or DNA) capable of binding to target mRNA (sense) or DNA
(antisense) sequences for lung cancer molecules. A preferred antisense molecule is for a lung
cancer sequence in the tables, or for a ligand or activator thereof. Antisense or sense
oligonucleotides, according to the present invention, comprise a fragment generally at least
about 14 nucleotides, preferably from about 14 to 30 nucleotides. The ability to derive an
antisense or a sense oligonucleotide, based upon a cDNA sequence encoding a given protein
is described in, e.g., Stein and Cohen (1988) Cancer Res. 48:2659 and van der Krol, et al.
(1988) BioTechniques 6:958.
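
Deriving an antisense oligonucleotide from a coding sequence amounts to taking the reverse complement of a chosen window of the target. The following is a minimal sketch under stated assumptions: the input sequence is hypothetical, the function name is illustrative, a DNA oligonucleotide is assumed as the output, and only the length range of about 14 to 30 nucleotides is taken from the text. It does not address specificity, secondary structure, or chemical modification.

```python
# Map each mRNA base to its DNA complement (a DNA antisense oligo is assumed).
_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna: str, start: int, length: int) -> str:
    """Return the DNA reverse complement of mrna[start:start + length], written 5' to 3'."""
    if not 14 <= length <= 30:
        raise ValueError("length should be from about 14 to 30 nucleotides")
    seq = mrna.upper().replace("T", "U")  # accept cDNA-style input as well
    if set(seq) - set("ACGU"):
        raise ValueError("sequence contains characters other than A, C, G, U/T")
    if start < 0 or start + length > len(seq):
        raise ValueError("window falls outside the sequence")
    target = seq[start:start + length]
    return target.translate(_COMPLEMENT)[::-1]


# Hypothetical 20-nt stretch of a target mRNA, used only for illustration.
example_mrna = "AUGGCCAAGUUCGAUCGGAC"
print(antisense_oligo(example_mrna, 0, 14))
```

Sliding the start position along the target sequence yields a set of candidate oligonucleotides that can then be screened experimentally.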

RNA interference is a mechanism to suppress gene expression in a sequence specific
manner. See, e.g., Brummelkamp, et al. (2002) Sciencexpress (21 March 2002); Sharp (1999)
Genes Dev. 13:139-141; and Carthew (2001) Curr. Op. Cell Biol. 13:244-248. In mammalian
cells, short, e.g., 21 nt, double stranded small interfering RNAs (siRNA) have been shown to
be effective at inducing an RNAi response. See, e.g., Elbashir, et al. (2001) Nature 411:494-498.
The mechanism may be used to downregulate expression levels of identified genes, e.g.,
for treatment of, or validation of relevance to, disease.



In addition to antisense polynucleotides, ribozymes can be used to target and inhibit
transcription of lung cancer-associated nucleotide sequences. A ribozyme is an RNA
molecule that catalytically cleaves other RNA molecules. Different kinds of ribozymes have
been described, including group I ribozymes, hammerhead ribozymes, hairpin ribozymes,
RNase P, and axhead ribozymes (see, e.g., Castanotto, et al. (1994) Adv. in Pharmacology
25:289-317 for a general review of the properties of different ribozymes).

The general features of hairpin ribozymes are described, e.g., in Hampel, et al. (1990)
Nucl. Acids Res. 18:299-304; European Patent Publication No. 0 360 257; U.S. Patent No.
5,254,678. Methods of preparing are well known to those of skill in the art (see, e.g., WO
94/26877; Ojwang, et al. (1993) Proc. Natl. Acad. Sci. USA 90:6340-6344; Yamada, et al.
(1994) Human Gene Therapy 1:39-45; Leavitt, et al. (1995) Proc. Natl. Acad. Sci. USA
92:699-703; Leavitt, et al. (1994) Human Gene Therapy 5:1151-120; and Yamada, et al.
(1994) Virology 205:121-126).

Polynucleotide modulators of lung cancer may be introduced into a cell containing the
target nucleotide sequence by formation of a conjugate with a ligand binding molecule, as
described in WO 91/04753. Suitable ligand binding molecules include, but are not limited to,
cell surface receptors, growth factors, other cytokines, or other ligands that bind to cell
surface receptors. Preferably, conjugation of the ligand binding molecule does not
substantially interfere with the ability of the ligand binding molecule to bind to its
corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide
or its conjugated version into the cell. Alternatively, a polynucleotide modulator of lung
cancer may be introduced into a cell containing the target nucleic acid sequence, e.g., by
formation of a polynucleotide-lipid complex, as described in WO 90/10448. It is
understood that antisense molecules or knock out and knock in models may also be
used in screening assays as discussed above, in addition to methods of treatment.

Thus, in one embodiment, methods of modulating lung cancer in cells or organisms
are provided. In one embodiment, the methods comprise administering to a cell an anti-lung
cancer antibody that reduces or eliminates the biological activity of an endogenous lung
cancer protein. Alternatively, the methods comprise administering to a cell or organism a
recombinant nucleic acid encoding a lung cancer protein. This may be accomplished in any
number of ways. In a preferred embodiment, e.g., when the lung cancer sequence is down-regulated
in lung cancer, such state may be reversed by increasing the amount of lung cancer
gene product in the cell. This can be accomplished, e.g., by overexpressing the endogenous
lung cancer gene or administering a gene encoding the lung cancer sequence, using known
gene-therapy techniques. In a preferred embodiment, the gene therapy techniques include the
incorporation of the exogenous gene using enhanced homologous recombination (EHR), e.g.,
as described in PCT/US93/03868, hereby incorporated by reference in its entirety.
Alternatively, e.g., when the lung cancer sequence is up-regulated in lung cancer, the activity
of the endogenous lung cancer gene is decreased, e.g., by the administration of a lung cancer
antisense or RNAi nucleic acid.

In one embodiment, the lung cancer proteins of the present invention may be used to
generate polyclonal and monoclonal antibodies to lung cancer proteins. Similarly, the lung
cancer proteins can be coupled, using standard technology, to affinity chromatography
columns. These columns may then be used to purify lung cancer antibodies useful for
production, diagnostic, or therapeutic purposes. In a preferred embodiment, the antibodies
are generated to epitopes unique to a lung cancer protein; that is, the antibodies show little or
no cross-reactivity to other proteins. The lung cancer antibodies may be coupled to standard
affinity chromatography columns and used to purify lung cancer proteins. The antibodies
may also be used as blocking polypeptides, as outlined above, since they will specifically
bind to the lung cancer protein.

Methods of identifying variant lung cancer-associated sequences

Without being bound by theory, expression of various lung cancer sequences is
correlated with lung cancer. Accordingly, disorders based on mutant or variant lung cancer
genes may be determined. In one embodiment, the invention provides methods for
identifying cells containing variant lung cancer genes, e.g., determining all or part of the
sequence of at least one endogenous lung cancer gene in a cell. In a preferred embodiment,
the invention provides methods of identifying the lung cancer genotype of an individual, e.g.,
determining all or part of the sequence of at least one lung cancer gene of the individual.
This is generally done in at least one tissue of the individual, and may include the evaluation
of a number of tissues or different samples of the same tissue. The method may include
comparing the sequence of the sequenced lung cancer gene to a known lung cancer gene, i.e.,
a wild-type gene.

The sequence of all or part of the lung cancer gene can then be compared to the
sequence of a known lung cancer gene to determine if any differences exist. This can be
done using known homology programs, such as Bestfit, etc. In a preferred embodiment, the
presence of a difference in the sequence between the lung cancer gene of the patient and the
known lung cancer gene correlates with a disease state or a propensity for a disease state, as
outlined herein.
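
The comparison of a sequenced gene against a known sequence can be illustrated with a short sketch. Note the substitution: the text names homology programs such as Bestfit, whereas the example below uses Python's standard-library difflib.SequenceMatcher purely as a stand-in to show the idea of listing differences; it is not an alignment program of that kind. Both sequences and the function name are hypothetical.

```python
from difflib import SequenceMatcher


def sequence_differences(reference: str, sample: str):
    """List (operation, reference position, reference bases, sample bases) for each difference."""
    matcher = SequenceMatcher(None, reference, sample, autojunk=False)
    return [
        (tag, i1, reference[i1:i2], sample[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only substitutions, insertions, and deletions
    ]


# Hypothetical wild-type and patient sequences differing at one position.
wild_type = "ATGGCCAAGTTCGATCGGAC"
patient = "ATGGCCAAGTACGATCGGAC"

for op, pos, ref_bases, alt_bases in sequence_differences(wild_type, patient):
    print(f"{op} at reference position {pos}: {ref_bases!r} -> {alt_bases!r}")
```

An empty list indicates that no difference from the known sequence was found over the region compared.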

In a preferred embodiment, the lung cancer genes are used as probes to determine the
number of copies of the lung cancer gene in the genome.

In another preferred embodiment, the lung cancer genes are used as probes to
determine the chromosomal localization of the lung cancer genes. Information such as
chromosomal localization finds use in providing a diagnosis or prognosis in particular when
chromosomal abnormalities such as translocations, and the like are identified in the lung
cancer gene locus.

Administration of pharmaceutical and vaccine compositions 

In one embodiment, a therapeutically effective dose of a lung cancer protein or
modulator thereof is administered to a patient. By "therapeutically effective dose" herein is
meant a dose that produces effects for which it is administered. The exact dose will depend
on the purpose of the treatment, and will be ascertainable by one skilled in the art using
known techniques (e.g., Ansel, et al. (1992) Pharmaceutical Dosage Forms and Drug
Delivery; Lieberman, Pharmaceutical Dosage Forms (vols. 1-3), Dekker, ISBN 0824770846,
082476918X, 0824712692, 0824716981; Lloyd (1999) The Art, Science and Technology of
Pharmaceutical Compounding; and Pickar (1999) Dosage Calculations). Adjustments for
lung cancer degradation, systemic versus localized delivery, and rate of new protease
synthesis, as well as the age, body weight, general health, sex, diet, time of administration,
drug interaction and the severity of the condition may be necessary, and will be ascertainable
with routine experimentation by those skilled in the art.

A "patient" for the purposes of the present invention includes both humans and other 

animals, particularly mammals. Thus the methods are applicable to both human therapy and
veterinary applications. In the preferred embodiment the patient is a mammal, preferably a
primate, and in the most preferred embodiment the patient is human.

The administration of the lung cancer proteins and modulators thereof of the present
invention can be done in a variety of ways, including, but not limited to, orally,
subcutaneously, intravenously, intranasally, transdermally, intraperitoneally, intramuscularly,
intrapulmonary, vaginally, rectally, or intraocularly. In some instances, e.g., in the treatment
of wounds and inflammation, the lung cancer proteins and modulators may be directly
applied as a solution or spray.

The pharmaceutical compositions of the present invention comprise a lung cancer
protein in a form suitable for administration to a patient. In the preferred embodiment, the
pharmaceutical compositions are in a water soluble form, such as being present as
pharmaceutically acceptable salts, which is meant to include both acid and base addition
salts. "Pharmaceutically acceptable acid addition salt" refers to those salts that retain the
biological effectiveness of the free bases and that are not biologically or otherwise
undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid,
sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid,
propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic
acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid,
methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid and the like.
"Pharmaceutically acceptable base addition salts" include those derived from inorganic bases
such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper,
manganese, aluminum salts and the like. Particularly preferred are the ammonium,
potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically
acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines,
substituted amines including naturally occurring substituted amines, cyclic amines and basic
ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine,
tripropylamine, and ethanolamine.

The pharmaceutical compositions may also include one or more of the following:
carrier proteins such as serum albumin; buffers; fillers such as microcrystalline cellulose,
lactose, corn and other starches; binding agents; sweeteners and other flavoring agents;
coloring agents; and polyethylene glycol.

The pharmaceutical compositions can be administered in a variety of unit dosage
forms depending upon the method of administration. For example, unit dosage forms
suitable for oral administration include, but are not limited to, powder, tablets, pills, capsules
and lozenges. It is recognized that lung cancer protein modulators (e.g., antibodies, antisense
constructs, ribozymes, small organic molecules, etc.) when administered orally, should be
protected from digestion. This is typically accomplished either by complexing the
molecule(s) with a composition to render it resistant to acidic and enzymatic hydrolysis, or by
packaging the molecule(s) in an appropriately resistant carrier, such as a liposome or a
protection barrier. Means of protecting agents from digestion are well known in the art.

The compositions for administration will commonly comprise a lung cancer protein
modulator dissolved in a pharmaceutically acceptable carrier, preferably an aqueous carrier.
A variety of aqueous carriers can be used, e.g., buffered saline and the like. These solutions
are sterile and generally free of undesirable matter. These compositions may be sterilized by
conventional, well known sterilization techniques. The compositions may contain
pharmaceutically acceptable auxiliary substances as required to approximate physiological
conditions such as pH adjusting and buffering agents, toxicity adjusting agents and the like,
e.g., sodium acetate, sodium chloride, potassium chloride, calcium chloride, sodium lactate
and the like. The concentration of active agent in these formulations can vary widely, and
will be selected primarily based on fluid volumes, viscosities, body weight and the like in
accordance with the particular mode of administration selected and the patient's needs (e.g.,
Remington's Pharmaceutical Science (15th ed., 1980) and Hardman, et al. (eds. 1996)
Goodman and Gilman: The Pharmacological Basis of Therapeutics).

Thus, a typical pharmaceutical composition for intravenous administration would be
about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per patient per
day may be used, particularly when the drug is administered to a secluded site and not into
the blood stream, such as into a body cavity or into a lumen of an organ. Substantially higher
dosages are possible in topical administration. Actual methods for preparing parenterally
administrable compositions will be known or apparent to those skilled in the art, e.g.,
Remington's Pharmaceutical Science and Goodman and Gilman, The Pharmacological Basis
of Therapeutics.




The compositions containing modulators of lung cancer proteins can be administered
for therapeutic or prophylactic treatments. In therapeutic applications, compositions are
administered to a patient suffering from a disease (e.g., a cancer) in an amount sufficient to
cure or at least partially arrest the disease and its complications. An amount adequate to
accomplish this is defined as a "therapeutically effective dose." Amounts effective for this
use will depend upon the severity of the disease and the general state of the patient's health.
Single or multiple administrations of the compositions may be administered depending on the
dosage and frequency as required and tolerated by the patient. In any event, the composition
should provide a sufficient quantity of the agents of this invention to effectively treat the
patient. An amount of modulator that is capable of preventing or slowing the development of
cancer in a mammal is referred to as a "prophylactically effective dose." The particular dose
required for a prophylactic treatment will depend upon the medical condition and history of
the mammal, the particular cancer being prevented, as well as other factors such as age,
weight, gender, administration route, efficiency, etc. Such prophylactic treatments may be
used, e.g., in a mammal who has previously had cancer to prevent a recurrence of the cancer,
or in a mammal who is suspected of having a significant likelihood of developing cancer
based, at least in part, upon gene expression profiles. Vaccine strategies may be used, in
either a DNA vaccine form, or protein vaccine.

It will be appreciated that the present lung cancer protein-modulating compounds can
be administered alone or in combination with additional lung cancer modulating compounds
or with other therapeutic agents, e.g., other anti-cancer agents or treatments.

In numerous embodiments, one or more nucleic acids, e.g., polynucleotides
comprising nucleic acid sequences set forth in the tables, such as antisense or RNAi
polynucleotides or ribozymes, will be introduced into cells, in vitro or in vivo. The present
invention provides methods, reagents, vectors, and cells useful for expression of lung cancer-associated
polypeptides and nucleic acids using in vitro (cell-free), ex vivo, or in vivo (cell or
organism-based) recombinant expression systems.

The particular procedure used to introduce the nucleic acids into a host cell for
expression of a protein or nucleic acid is application specific. Many procedures for
introducing foreign nucleotide sequences into host cells may be used. These include the use
of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection,
plasmid vectors, viral vectors and other well known methods for introducing cloned genomic
DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see, e.g.,
Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology
volume 152 (Berger); Ausubel, et al. (eds. 1999) Current Protocols (supplemented through
1999); and Sambrook, et al. (1989) Molecular Cloning - A Laboratory Manual (2nd ed., Vol.
1-3)).

In a preferred embodiment, lung cancer proteins and modulators are administered as
therapeutic agents, and can be formulated as outlined above. Similarly, lung cancer genes
(including both the full-length sequence, partial sequences, or regulatory sequences of the
lung cancer coding regions) can be administered in a gene therapy application. These lung
cancer genes can include antisense or inhibitory applications, e.g., as inhibitory RNA or gene
therapy (e.g., for incorporation into the genome) or as antisense compositions.

Lung cancer polypeptides and polynucleotides can also be administered as vaccine
compositions to stimulate HTL, CTL, and antibody responses. Such vaccine compositions
can include, e.g., lipidated peptides (see, e.g., Vitiello, et al. (1995) J. Clin. Invest. 95:341),
peptide compositions encapsulated in poly(DL-lactide-co-glycolide) ("PLG") microspheres
(see, e.g., Eldridge, et al. (1991) Molec. Immunol. 28:287-294; Alonso, et al. (1994) Vaccine
12:299-306; Jones, et al. (1995) Vaccine 13:675-681), peptide compositions contained in
immune stimulating complexes (ISCOMS) (see, e.g., Takahashi, et al. (1990) Nature
344:873-875; Hu, et al. (1998) Clin. Exp. Immunol. 113:235-243), multiple antigen peptide
systems (MAPs) (see, e.g., Tam (1988) Proc. Natl. Acad. Sci. U.S.A. 85:5409-5413; Tam
(1996) J. Immunol. Methods 196:17-32), peptides formulated as multivalent peptides;
peptides for use in ballistic delivery systems, typically crystallized peptides, viral delivery
vectors (Perkus, et al., p. 379 In: Kaufmann (ed. 1996) Concepts in vaccine development;
Chakrabarti, et al. (1986) Nature 320:535; Hu, et al. (1986) Nature 320:537; Kieny, et al.
(1986) AIDS Bio/Technology 4:790; Top, et al. (1971) J. Infect. Dis. 124:148; Chanda, et al.
(1990) Virology 175:535), particles of viral or synthetic origin (see, e.g., Kofler, et al. (1996)
J. Immunol. Methods 192:25; Eldridge, et al. (1993) Sem. Hematol. 30:16; Falo, et al. (1995)
Nature Med. 7:649), adjuvants (Warren, et al. (1986) Annu. Rev. Immunol. 4:369; Gupta, et
al. (1993) Vaccine 11:293), liposomes (Reddy, et al. (1992) J. Immunol. 148:1585; Rock
(1996) Immunol. Today 17:131), or, naked or particle absorbed cDNA (Ulmer, et al. (1993)
Science 259:1745; Robinson, et al. (1993) Vaccine 11:957; Shiver, et al., p. 423 In:
Kaufmann (ed. 1996) Concepts in vaccine development; Cease and Berzofsky (1994) Annu.
Rev. Immunol. 12:923 and Eldridge, et al. (1993) Sem. Hematol. 30:16). Toxin-targeted
delivery technologies, also known as receptor mediated targeting, such as those of Avant
Immunotherapeutics, Inc. (Needham, Massachusetts) may also be used.

Vaccine compositions often include adjuvants. Many adjuvants contain a substance
designed to protect the antigen from rapid catabolism, such as aluminum hydroxide or
mineral oil, and a stimulator of immune responses, such as lipid A, Bordetella pertussis or
Mycobacterium tuberculosis derived proteins. Certain adjuvants are commercially available
as, e.g., Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit,
MI); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ); AS-2 (SmithKline
Beecham, Philadelphia, PA); aluminum salts such as aluminum hydroxide gel (alum) or
aluminum phosphate; salts of calcium, iron or zinc; an insoluble suspension of acylated
tyrosine; acylated sugars; cationically or anionically derivatized polysaccharides;
polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A and quil A.
Cytokines, such as GM-CSF, interleukin-2, -7, -12, and other like growth factors, may also be
used as adjuvants.

Vaccines can be administered as nucleic acid compositions wherein DNA or RNA
encoding one or more of the polypeptides, or a fragment thereof, is administered to a patient.
This approach is described, for instance, in Wolff, et al. (1990) Science 247:1465 as well as
U.S. Patent Nos. 5,580,859; 5,589,466; 5,804,566; 5,739,118; 5,736,524; 5,679,647; WO
98/04720; and in more detail below. Examples of DNA-based delivery technologies include
"naked DNA", facilitated (bupivacaine, polymers, peptide-mediated) delivery, cationic lipid
complexes, and particle-mediated ("gene gun") or pressure-mediated delivery (see, e.g., U.S.
Patent No. 5,922,687).

For therapeutic or prophylactic immunization purposes, the peptides of the invention
can be expressed by viral or bacterial vectors. Examples of expression vectors include
attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of
vaccinia virus, e.g., as a vector to express nucleotide sequences that encode lung cancer
polypeptides or polypeptide fragments. Upon introduction into a host, the recombinant
vaccinia virus expresses the immunogenic peptide, and thereby elicits an immune response.
Vaccinia vectors and methods useful in immunization protocols are described in, e.g., U.S.
Patent No. 4,722,848. Another vector is BCG (Bacille Calmette Guerin). BCG vectors are
described in Stover, et al. (1991) Nature 351:456-460. A wide variety of other vectors useful
for therapeutic administration or immunization, e.g., adeno and adeno-associated virus
vectors, retroviral vectors, Salmonella typhi vectors, detoxified anthrax toxin vectors, and the
like, will be apparent to those skilled in the art from the description herein (see, e.g., Shata, et
al. (2000) Mol. Med. Today 6:66-71; Shedlock, et al. (2000) J. Leukoc. Biol. 68:793-806;
Hipp, et al. (2000) In Vivo 14:571-85).

Methods for the use of genes as DNA vaccines are well known, and include placing a
lung cancer gene or portion of a lung cancer gene under the control of a regulatable promoter
or a tissue-specific promoter for expression in a lung cancer patient. The lung cancer gene
used for DNA vaccines can encode full-length lung cancer proteins, but more preferably
encodes portions of the lung cancer proteins including peptides derived from the lung cancer
protein. In one embodiment, a patient is immunized with a DNA vaccine comprising a
plurality of nucleotide sequences derived from a lung cancer gene. For example, lung
cancer-associated genes or sequences encoding subfragments of a lung cancer protein are introduced
into expression vectors and tested for their immunogenicity in the context of Class I MHC
and an ability to generate cytotoxic T cell responses. This procedure provides for production
of cytotoxic T cell responses against cells which present antigen, including intracellular
epitopes.
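
By way of illustration only, the enumeration of candidate subfragments of a protein prior to cloning and immunogenicity testing can be sketched as follows. The sketch assumes fixed-length, overlapping windows over the amino acid sequence; the default window of 9 residues (a length commonly associated with Class I MHC-binding peptides), the function name `tile_subfragments`, and the example sequence are all assumptions introduced here and are not taken from the specification.

```python
def tile_subfragments(protein: str, length: int = 9, step: int = 1) -> list[str]:
    """Return every window of `length` residues, advancing `step` residues at a time.

    `length` and `step` are illustrative parameters; the specification does not
    prescribe a subfragment length or overlap.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    # A sequence shorter than `length` yields an empty list.
    return [protein[i:i + length] for i in range(0, len(protein) - length + 1, step)]


# Hypothetical 12-residue sequence, used only to show the tiling.
example = "MKTAYIAKQRQI"
peptides = tile_subfragments(example, length=9, step=1)
for peptide in peptides:
    print(peptide)
```

Each resulting subfragment would then correspond to one coding sequence to be introduced into an expression vector; the choice of window length and overlap would in practice depend on the assay used.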

In a preferred embodiment, DNA vaccines include a gene encoding an adjuvant 
molecule with the DNA vaccine. Such adjuvant molecules include cytokines that increase 
the immunogenic response to the lung cancer polypeptide encoded by the DNA vaccine. 
Additional or alternative adjuvants are available. 

In another preferred embodiment lung cancer genes find use in generating animal
models of lung cancer. When the lung cancer gene identified is repressed or diminished in
metastatic tissue, gene therapy technology, e.g., antisense or inhibitory RNA directed
to the lung cancer gene, will also diminish or repress expression of the gene. Animal models
of lung cancer find use in screening for modulators of a lung cancer-associated sequence or
modulators of lung cancer. Similarly, transgenic animal technology including gene knockout
technology, e.g., as a result of homologous recombination with an appropriate gene targeting
vector, will result in the absence or increased expression of the lung cancer protein. When
desired, tissue-specific expression or knockout of the lung cancer protein may be necessary.

It is also possible that the lung cancer protein is overexpressed in lung cancer. As
such, transgenic animals can be generated that overexpress the lung cancer protein.
Depending on the desired expression level, promoters of various strengths can be employed
to express the transgene. Also, the number of copies of the integrated transgene can be
determined and compared for a determination of the expression level of the transgene.
Animals generated by such methods will find use as animal models of lung cancer and are
additionally useful in screening for modulators to treat lung cancer.



Kits for Use in Diagnostic and/or Prognostic Applications

For use in diagnostic, research, and therapeutic applications suggested above, kits are
also provided by the invention. In diagnostic and research applications such kits may include
at least one of the following: assay reagents, buffers, lung cancer-specific nucleic acids or
antibodies, hybridization probes and/or primers, antisense polynucleotides, ribozymes, RNAi,
dominant negative lung cancer polypeptides or polynucleotides, small molecule inhibitors of
lung cancer-associated sequences, etc. A therapeutic product may include sterile saline or
another pharmaceutically acceptable emulsion and suspension base.

In addition, the kits may include instructional materials containing instructions (e.g.,
protocols) for the practice of the methods of this invention. While the instructional materials
typically comprise written or printed materials they are not limited to such. A medium
capable of storing such instructions and communicating them to an end user is contemplated
by this invention. Such media include, but are not limited to, electronic storage media (e.g.,
magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the like. Such
media may include addresses to internet sites that provide such instructional materials.

The present invention also provides for kits for screening for modulators of lung
cancer-associated sequences. Such kits can be prepared from readily available materials and
reagents. For example, such kits can comprise one or more of the following materials: a lung
cancer-associated polypeptide or polynucleotide, reaction tubes, and instructions for testing
lung cancer-associated activity. Optionally, the kit contains biologically active lung cancer
protein. A wide variety of kits and components can be prepared according to the present
invention, depending upon the intended user of the kit and the particular needs of the user.
Diagnosis would typically involve evaluation of a plurality of genes or products. The genes
typically will be selected based on correlations with important parameters in disease which
may be identified in historical or outcome data.

Example 1: Gene Chip Analysis

Molecular profiles of various normal and cancerous tissues were determined and
analyzed using gene chips. RNA was isolated and gene chip analysis was performed as
described (Glynne, et al. (2000) Nature 403:672-676; Zhao, et al. (2000) Genes Dev. 14:981-993).

Tables 1A and 1B were previously filed on April 18, 2001 in USSN 60/284,770 (18501-001SOOUS) and on November 29, 2001 in USSN 60/334,370
(18501-001SOUS)

M398209






121362 


AA405500 


Hs.97932 


Chondromodulln 1 precursor 


121389 


M4056S7 


Hs.128791 


CGI-09 prol^ 


121791 


AA423978 


Hs.293317 


'ESTs, Weakly similar to JM27 [H.sapens 


123005 


AM79726 


Hs.105577 


ESTs 


123044 


AA481549 


Hs.130881 


B-cell CLUymphoma 11A (zinc linger pro 


123160 


AA488687 


Hs.284235 




123479 


AA599469 


HS.1350S6 


done RP5-8S0E9 on chromosome 20 


123571 


AA8089S6 


Hs.1 12619 


'ESTs, Weakly similar to PQ0109 Purkinje 


123829 


AA620697 


HS.1122C8 


XAGE-1 prolan 


124006 


060302 


Hs.108977 


ESTs 


124059 


F13673 


Hs.99769 


ESTs 


124960 
125218 


115386 


Hs.194766 


Sslzure related gene 6 (mouse)-Iike 


W73561 


Hs.1 10024 


NADH:ubiquIno(w oxidoreduetase tAAQ subu 


1^53 


R06041 


Hs.18048 


"Melanoma antigen. My A, 10^ 


12S7S9 


AA42ffi87 


Hs.82226 


GlycoprDtsIn (transmemlirane) nmb 


12S972 


M434562 


Hs.35406 


*ESTs, Highly sin^arto unnarned protein 


125994 
126395 


H55782 


Hs270799 


EST 


N7ai92 


Hs^789S6 


Hypothetical prolan FIJ12929 


126645 


A1167942 


HS.S1835 


STEAP1 (Homo sapiens BAG done RG041D1 1 


127221 


AI3S4332 


HS.7236S 




127479 


MS13722 


Hs.1797a 


coEagen; type X; alpha 1 (Schmld metapn 


128192 


Ma04246 




KIAA108S protetn 


128610 


L38608 




acBvaled leucocyte cell adhesion malecu 


128777 


1)46006 


KS.10S26 


{^teine and gtynne-rtch protdn 2 


128924 


AA234962 


K&265S7 


Rakophflin 3 








'Sdute carrier tenDy 2 (ladiltated gl 


129099 


H50398 


Hs.108660 


•ATP-tjInding cassette, sut)-lainily C (OFT 


129404 


AA172056 


Hs.111128 


ESTs 


129466 


L42S63 




'Genbank Homo sapiens keraUn 6 Isolbrm 


123605 
129628 


S72493 


Hs.115947 


KeraBn 16 (local nan-epid«niolyilc patm 


U26727 


Hs.1174 


•Cydin-depandeiitUnase Inhlbttor 2A (m 


130023 


X13461 


H&23960O 


Calm>dufin4ike3 


130080 


X14850 


Ks.147097 


■H2A Wstooa family. memlKr X* 


130385 


AA128474 


HS.1S5223 


stanniocalcin 2
130410


V015t4 


Hs.155«l 


Alplia-Mapnrteh 


ass 


a63 


130441 


03^35 


Hs.301387 


-Human DNAnCmRNA. partial cds* 


1.15 


3.65 


130482 


132866 


Hs.1578 


Bacuhwlral lAP reped^ntainlng 5 (sur 
Pftuitanf twnor-lransfiDnrang 1 




1.88 


130553 


AA430032 


HS.2S2S87 


a92 


1.98 


130577 


M35410 


Hs.162 


lnsunn4B« gniwlh factor binding prote 


1.17 


4.7 


130627 


L23808 


Hs.1695 


Matrix mstallaproteinase 12(macn)phsQs 


0.69 


4.05 


130800 


AA223388 


Hs.19574 


ESTs; Weakly similar to Itatanin p80 subu 
ESTs 


1.13 


2.41 


130939 


AAS98689 


Hs^1400 


0.8 


0.89 


131046 


X02S30 


H5.2248 


IMTERFERON43AMMAWDUCED PROTEIN PRECURS 0.8 


1.15 


131244 


038076 


Hs^4763 


RANUndtagprateinl 


1.13 


1.85 


131877 


J04088 


Hs.156346 


Topoisomaase fptV^ II dpha (ITOkO) 


1 


1 


131927 


AA461549 


Hs^TSO 


*Ooubteoart«; Bssencaphaly. X-Bnked ( 


0.81 


a62 


131965 
131978 


W90146 
D80008 


Hs.3SgS2 
Ks.36232 


ESTs 

KIAAOl 86 gene product 


0.74 


3.27 


132354 


L0S187 


Hs.211913 


Small pioHne-ilcti protein lA 




1.43 


132543 


AA417152 


Hs.5101 






4.27 


13^32 


N59764 


Hs.5398 




1 


1.08 


132653 


U31201 


H3.54451 






1 


132659 


Z7S190 


Hs.54481 






a89 


132710 


W93726 


Hs.55279 


'Sertna (or cysteine) proteinase Inhibit 


a64 


4.41 


132758 


W52432 


HS.S6105 


'ESTs, Weddy similar to WDNM RAT WDNM1 




2.08 


132767 


LDSISB 


Hs.231622 






1.66 


132816 


M74S42 


HS.S7S 






0.55 


132990 


AA458761 


Hs.18387 






3.53 


133070 


U69S11 


Hs.64311 






2 


133282 


U52960 


Hs,286145 






2.7 


133317 


AA215299 


Hs.70830 


U6 snRNA-associated Sn>4ike protein l^m7 




1.42 


133370 


M156897 


Hs.72157 


Homo saddens mRNA; cDHA OKFZp564l 1 922 




2.55 


133391 


X57579 


Hs.727 






1.76 


133832 
134032 


H03387 
Z81328 


HsJ341305 
H3.78589 




102 


1.39 
1 


134168 


AA398908 


H3.181634 


*Homo sapiens cONA: FU23502fi5, dona 






134218 


AA227480 


Hs.80205 






Z48 


134405 


R67275 


Hs.82772 




2.86 


134453 


X70633 


Hs.83484 


SRY (sex determinlr^ region Yj-box 4 




a78 


134470 


X54342 


Hs.83758 


COC23 protein kinase 2 




4.11 


134645 


U87459 


Hs.167379 






a83 


134781 
135002 


M17183 
U19147 


H&89B2S 
Hs.272484 






1 
1 


100040 


M97g3S 




AFFX control: STATl 




1.25 


101201 


L22524 


H3.2256 






as 


101664 


M607S2 


H5.121017 






1 


102025 


U03911 


HS.7B934 






1.61 


102031 


UM898 


Hs.2156 


RARialated or^ian receptor A 


l" 


1 


102221 


U24576 




UM domain (»^ 4 


1 


1 


102270 


U302S5 


H3.7SS88 


pftosphogluconata dsftydrogenase 




1.43 


102339 


U37022 


Hs.95577 


cydtrntependent Unase 4 


o!88 


1.32 


102391 


U41668 


Hs.77494 


deoxyguanoslne kinase 


1.07 


1.58 


103000 


XS1956 


H5.146560 






1.49 


103395 


X94754 


Hs.l 19503 






1.32 


105638 


AA281599 


Hs.20418 


Homo sapiens mRNA for for histone H2B; o 




1.25 


105726 


AA292328 


HS.97S4 






1.48 


114841 


AA234722 


Hs.S540a 


ESTs; (Axlwalsly simOar to CALQUM-OEPE 




1.56 


115206 


AA262491 


Hs.1 86572 




1 


1 


115906 


AA436616 


Hs.82302 






2.S2 


119132 


R49046 


Hs.1 07911 


ATP-blndIng cassette; sub-femliy B (t/IOR/ 


11 


1.51 


124163 


H3053g 


Hs.189838 




1 


1 


126487 


AM82505 


Hs.1 84601 


solute carrier MIy 7 (caBonIc an^ 


1.01 


1.46 


127141 


AA307g60 


HS.7S478 


KIAA0956 protein 


0,85 


1.4 


128034 


AA90S754 


Hs.75103 




1 


1.18 


128609 


AA234365 


Hs.102456 




1 


1.5 


128895 
130199 
130524 


Z48S79 
U8999S 


Hs.1 06985 
Hs.l720a 
Hs.159234 


a disintegrtn and metaUoproteasa domain 
fortdieadboxEl 


l" 
1 


2 
1 
1 


133000 
133658 
135047 


U24152 
M25756 
AA4e0466 


Hs.62402 
Hs.75426 
Hs.93597 


p21/Cdc42/Rac1-aclKaled Unase 1 (yaast 

ESTs 


1 

1 


1 
1 
1 


1000S3 


M27830 




AFFXcontrat: 288 ribosomal RNA 


0.88 


1.53 


100114 


000596 


Hs.82962 


thymidylate syntfistase 


0.68 


1.86 


100128 


01 1094 


Hs.61153 


ptoteasoms (prosoms; macropain} 26S subu 


1.29 


Z03 


100154 


014657 


Hs.81892 


KIAAOIOl gene prodwl 


0.71 


4.26 


100161 


014694 


H3.77329 


pfiosphatldylserine synthase 1 


1.02 


1.56 


100168 


D14874 


HsJ94 


adrenomedullln 


0.46 


1.17 


100187 


017793 


Hs.78183 


aldo^eto reductase taniy 1; member C3 


1 


1 


100188 


D21063 


Hs.57101 


n^niduxMnosoms m^ntenance detiden] (S. 


a97 




100217 


D2BB00 


H5.83545 


proteasome (prosome; manxjpdn] subunil; 
—Human mRNA for annexln II, SVTR (seq 


1.13 


1.9 


100220 


028364 




1.11 




100287 




Hs.1600 


chs^KTonln contMg TCP1; subunit 5 (e 


1.13 


£09 


(00297 


049489 


fte.182429 


pretebi dsullide isomerase^ated pro) 


0.92 


1.78 


100330 


055716 


HS.771S2 


nMchromosome m^tenancs deSdant (S. 


1.07 


1.61 


1003SS 


07812S 




"*Hamo sapiens mi^ for squalene epoxid 


0.98 


1.87 


100364 


078586 


Hs.154868 


carbamoyl-phosphate syntlielase % aspait 


1.49 


Z46 


100368 


079987 


Hs.153479 


extra spindle poles; S. cei8vl$ias; homo 


0.59 


1.32 


100398 


084557 


Hs.155452 




1.08 


1.9 


100438 


D87448 


Hs.91417 


topdsonerase ^NA) B binding protein 


1 


Z15
D87953


Hs.75789 


N-fnyc downsfream fegitfaled 








HG1153-HT1153 


NucteosMe Diphosphate Kinase Nm23-H2s 








HG174+1T174 




Oesmoplakin 1 






100528 


HG182&+fn857 


'"Newn, aia-Derived"" 






100S61 


HG2874-HT3018 


Ribosomal Protan L39 Homolog 






100667 


HG2981-Hni27 


""Eplcan, AH Splice 1 1™ 






100830 


HG4074-HT4344 


Rad2 






1010S1 


K03515 


Hs.944 


glucose phosphate isomerase 








110838 


Hs. 167460 


spTdng factor; siginihefeerina-ridl 3 






101162 


L14595 


Hs.174203 


solute canier^ly 1 (^utamateMeutr 








119686 


Hs.737g8 


macrophc^s migtsiian inhniloiy lector ( 








L19779 


HS.79S 


H2A histone family; member 0 






101216 


L2S876 


Hs.84113 


cydln-dependent kinase lnhihitor3 (COK 






101228 


L27706 


H3.82916 


chapeforan containing TCPl ; subunit 6A ( 






101233 


1^9008 


Hs.878 


soriiitol dehydiogenase 






101247 


13^01 


Hs.78802. 


glycogen synfliase Idnase 3 beta 






101332 


L47276 




—Homo sapiens (cell line HL-6) dpha I 
lnteriaukin-1 receptor-associated idnase 


0.69 




101342 


L76191 


Hs.182018 






10139S 


M15795 


Hs.78996 


pralilerating cell nudear antigen 


0.95 




101423 


M18391 


Hs.89839 


EphAI 






101445 


M21259 


Hs.i086 


sman nuclear ribanudeoprotein polypept 






101505 


M27396 


HS.7S692 


asparaglne synthetase 






101525 


M29536 


Hs.12163 


eultaryolic translation initiation factor 






101535 


M30448 


HS.2S1E69 


casein kinase 2; beta potypepEde 






101607 


M38890 


Hs.1244 


COS antigen (p24] 






101624 


MS5998 




"'Human alpha-l collagen type 1 gene, 3 






101758 


M77836 


Hs.79217 


pymilMe-SKiarboxyl^ reductase 1 






101839 


M93038 


Hs.692 


fuemtirane component; chromosomal 4; surfa 






101853 


M94%2 


Hs.76084 








W1977 


S83364 




"putative RabS^iteracdng pnilsin {d 


0.89 




101992 


U01038 


HS.77S97 


polo {I3rDsophia)-llke kinase 


a68 




102009 


U02680 


Ks.82643 


protein tyrosine Mnase 9 


1.23 


3.35 


102012 


U03057 


Hs.118400 


singed (OrasoplulaHite (sea iBdi'm fas 


ass 


1.88 


102039 


U05861 


H5^ig57 


aldo*t(eto reductase family 1 ; member CI 


0.93 


2.32 


102123 


U14518 


Hs.1594 


cenlromafB protein A (ITkD) 


1 




102130 


U15009 


HS.157S 


snaS\ nuclear nlxinucteoproteln D3 polyp 


a89 


1.42 


102148 


U16954 


Hs.75823 


AliWused gene from diromosome 1q 






102210 


U23028 


Hi2437 


eukaryotic translatkin Inlliatbn factor 


1.01 




102220 


U24389 


Ks.65436 


1^ oiddas&Jlke 1 


1.15 




102260 


U28386 


Hs.15g557 


kaiyophsiln dpha 2 (RAG cohort 1 ; impor 






102330 


U35451 


Hs.77254 


chremobox homolog 1 (DrosophSa HP1 beta 


1,05 


1.7 


102423 


U447S4 


H5.179312 


smaP nuclear (WA acBvafing complex; po 


1.14 




102455 


U48705 


Hs,75562 


discoidin domain receptor femily; member 


1.05 




102499 


U51478 


Hs.76941 


ATPas? Nat/K+ transporting; beta 3 poly 






102522 


US3347 


HS.183S56 


solute canter family 1 (neutral amino a 






102S90 


U82136 




—Homo sapiens enterocyte differentlali 


1.11 




102676 


U72514 


HS.1204S 


putative protein 


1.04 




102687 


U73379 


Hs.g3002 


ubiquffln canter protein E2-C 


0.86 




102704 


U76838 


Hs.54089 


BPCAt associated RING domain 1 






102781 


U83843 




—Human HW-I Nef Interacting proldn ( 






102784 


U85658 


Hs.61796 


transcription bdor AP-2 gamma (acfivat 








U91327 


HS.64S6 


chaperonin containing TCPl; subunil 2 (b 






102935 


X13482 


HS.80S06 


small nudear ribonudeoprateln potypefd 


1.21 




10K72 


X16662 


Hs.87268 


anne)dnA8 






102983 
103023 


X17620 
XS3793 


Hs.1 18638 
Hs.1 17950 


non-metastalte cells 1; protein (NM23A) 
multi^nctional potypepBde similar to S 


1.03 






X54941 


Hs.77550 


COC28 protein Idruasa 1 






103075 


X59543 


Hs.2934 


ribonucleotide reductase Ml polypeptide 


1.11 




103168 


X88314 


Hs.2704 


glutathione peroxidase 2 (gasbolntestin 






10318S 


X69910 


Hs.74368 


transmembrane protdn (63kl}); endoptasm] 


1.01 




103212 


X73874 


Hs.2393 


phosphorylase kinase; sdpha 1 (rausde) 
chapaonbi containing TCPl; subunil 3 Cg 








X74801 


Hs.1708 






103260 


X78416 


HS.315S 


casein; alpha 






103262 


X78S65 


Hs^04133 


hexabradiion (tenasdn d cytotadin} 






103330 


X8S373 


Hs.77496 


sndl nudearilbanucleaproteln polypept 


1.12 




103364 


X90872 


Hb.75854 


SULTIC suHolransferase 






103375 


X91868 


Hs.54416 


sine acuSs homeobox (t^osophila) Itomolo 


1 




103391 


X94453 


Hs.114366 


pym)flne-5<»it)axylata synSietase (glut 




1.53 


103404 


Xg5S86 


Hs.78596 


pmteasome (prosome; macropaln) subunit; 






103437 


X98260 


HS.B2254 








103448 


X99133 


Hs.204238 


BpocaBn 2 (oncogene 24p3) 






103605 


Z3S4Q2 


Hs.194657 


cadherln 1; E-cadherin (epithelial} 






103646 




Hs.2340 


juRcfkui piakoglobin 






103658 


Z74615 


Hs.172g28 


odlagen:typel;a^1 






103774 


AA092898 


Hs.g2918 


ESTs; Wa^tdy sinflar to R07G3.8 (C^ga 






104261 


AF008442 


Hs^09 ' 


IWA polymerase 1 subunit 


?r 






C02193 


Hs.85222 


ESTs; Weakly sImSar to R27D90_2 (H^apl 




2.49 


104289 


C16281 


Hs.75478 


KIAA6956pn^ 


US 


1.68 


104434 


L02870 


Hs.1640 


cdlagen; type Vit alpha 1 (epidennolys 


1.04 


1.49 


104453 


M19169 


Hs.123114 


cystaBnSN 


a38 


0.76 


104611 


R98280 


Hs.125845 


ribuIas»6-phosphate-S«pimerase 


1.08 


Z2S 


104758 


AA024661 


Hs.7010 


ESTs; WbaMy sbidlar to ACYUCOA DEHyORO 


1.14 


1.65 


105114 


AAt56532 


Hs.1180t 


adenosine A2b receptor pseudogene 


asi 


1.38 


105132 


M159SG1 


Hs.247280 


HBVassodaledtactor 


1.08 


1.7 


105174 


AA186813 


Hs.34744 


ESTs 


0.95
ESTs
ESTs

Tables 2A-8C were previously filed on November 9, 2001 in USSN 60/339,245 (18501-004100US)

Table 2A shows 504 genes down-regulated in lung tumors relative to normal lung and chronically diseased lung. Chronically diseased lung samples represent chronic non-malignant lung diseases such as fibrosis, emphysema, and bronchitis. These genes were selected from 53580 probesets on the Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting the relative level of mRNA expression.

R2: median of AI for normal lung samples divided by the 90th percentile of AI for adenocarcinoma and squamous cell carcinoma lung tumor samples.

R3: median of AI for normal lung samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples divided by the 90th percentile of AI for adenocarcinoma and squamous cell carcinoma lung tumor samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples.

R4: average of AI for normal lung samples divided by average AI for squamous cell carcinoma and adenocarcinoma lung tumors.

R5: median of AI for normal lung samples divided by the 90th percentile of AI for adenocarcinomas.

R6: median of AI for normal lung samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples divided by the 90th percentile of AI for adenocarcinomas minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples.

R7: average of AI for normal lung samples divided by the 90th percentile of AI for squamous cell carcinomas.

R8: median of AI for normal lung samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples divided by the 90th percentile of AI for squamous cell carcinomas minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples.
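
For illustration only, the ratios R2 through R8 defined above can be computed for a single probeset as in the following sketch. Several points are assumptions not stated in the table legend: the percentile is taken by linear interpolation between ranks, the "minus the 15th percentile" terms are read as a baseline subtracted from both numerator and denominator, and the names `percentile` and `down_regulation_ratios` and the sample values are hypothetical.

```python
from statistics import mean, median


def percentile(values, pct):
    """Percentile by linear interpolation between closest ranks (an assumed method)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values")
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def down_regulation_ratios(normal, diseased, adeno, squamous):
    """Compute R2-R8 for one probeset from sequences of average-intensity (AI) values."""
    normal, diseased = list(normal), list(diseased)
    adeno, squamous = list(adeno), list(squamous)
    tumors = adeno + squamous
    # 15th percentile of AI over all normal, chronically diseased and tumor samples.
    baseline = percentile(normal + diseased + tumors, 15)

    def baseline_corrected(tumor_group):
        # Assumed grouping: (median normal - baseline) / (90th pct tumor - baseline).
        return (median(normal) - baseline) / (percentile(tumor_group, 90) - baseline)

    return {
        "R2": median(normal) / percentile(tumors, 90),
        "R3": baseline_corrected(tumors),
        "R4": mean(normal) / mean(tumors),
        "R5": median(normal) / percentile(adeno, 90),
        "R6": baseline_corrected(adeno),
        "R7": mean(normal) / percentile(squamous, 90),
        "R8": baseline_corrected(squamous),
    }


# Hypothetical AI values for one probeset, chosen only to exercise the function.
ratios = down_regulation_ratios(
    normal=[100, 120, 140],
    diseased=[90, 110],
    adeno=[10, 20, 30],
    squamous=[20, 40, 60],
)
for name, value in ratios.items():
    print(name, round(value, 2))
```

A ratio well above 1 indicates lower expression in tumor than in normal lung. The sketch does not guard against a zero denominator, which can occur when a tumor percentile equals the baseline.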

Pkey ExAccn UnigeneID UnigeneTitle R1 R2 R3 R4 R5 R6 R7 R8

100095 Z9717t Hs.784S4 myocnin: trabecular msshwork Inducible 40.20 

100115 NMJXQ084 Hs.336920 glutathione peio)ddase 3 (plasma) 3-46 

100138 UB3508 H5Ji483 antfopoletlnl 2.30 

30 100299 D49493 Hsi171 growth diffwenli^on factor 10 11.00 

100306 U86749 >ts.80598 transcripGondongatlon factor A (Sll); 3.06 

100447 NM_014767 Hs.74583 KIAA0275 gene product 3,16 

1004SB S74019 Hs.247979 VpreB 42.40 

100862 AA00S247 Hs.2a5754 Hepatocyte Growth Factor Receptor 

35 100959 AA359129 Hs.118127 

101032 BE2]6854 Hs.46039 

101081 AF047347 Ks.4B80 

101088 X70697 Hs.SS3 

101125 AJ25D562 Hs.82749 

40 101180 U11874 Hs.846 Interteukln 8 receptor; beta 

101308 L41390 'Homo sapiens core 2 beta-I.O-N-ace^rtgl 

101330 L43821 Ks.80261 enhancer of filamenlatlon 1 (cas4il(e do 

101345 NM_00S79S Hs.1 52175 (^citonin receptor-Bie 

101346 Ai738616 Hs.77348 hydroi^proslaglandlndehydrxigenase 15-(N 
45 101397 M26380 ^.180878 lipopraiain Dpase 

101414 NMJ0OOO66 Hs.3a069 complementconipanent 8; beta polypeptide 

101435 NM-001100 Hs.12a8 acdn; aipin 1; skeletal muscle 

101507 Xiesge tts.82112 Inbiteiddn 1 recepton type 1 

101530 M29874 Hs.1360 cytochrome P450; subfamily IIB (phenobar 

50 101537 A14690S9 Hs.ia4915 zinc finger protein; Y-Jinked 

101542 NM_000102 Hs.1363 cytixhrome P4S0; subfamily XWl {steroid 

101545 BE246154 HS.1S4210 EDG1; endotheli^ (fifferentlalian. sphln 

101554 BE207611 Hs.12307a thyroid sBmulatinghonnone receptor 

101560 AW9S8272 Hs.83733 Intercellular adhesion molecule 2, exon 

55 101574 M34182 HS.15B029 protein Mnase; cAMP-dapendenl; catalyli 

101605 M37g84 Hs.1ia845 troponin C dow 

101621 8E391804 H$.62661 guanyiate binding protein l. lnterferon- 

101680 AAa9330 Hs.1042 Sjogren syndrorra antigen At (52kD;ijbon 

101829 AW452398 Hs.129763 solute canler family 8 {sadiumfceldum 

60 101842 M93221 Hs.75182 mannoserecepton 1 

101961 AW004O56 Hs.168357 "Hs-TBX2=T-box gene {T-box region) piuma 

101994 T92248 tts.2240 uteroglobin 

102020 AU077315 HS.1S4970 bansaiplion factor CP2 

imai BE2aog01 Hs.8315S aWaiyde dehydrogenase 7 

65 102112 AW02S430 HS.1S5591 IbrMiead boxFI 

102190 AA723tS7 Hs.73769 fblate receptor 1 (adull) 

102202 NM_000507 Hs.S74 fhidose-blsphosphalasa 1 

102241 NM_007351 H5.268107 Multlmerin 

102310 U33839 Accession not listed h Genbank 

70 102397 U41898 'Human sodium cotransporlarRKSTI mRNA, 

102571 U6011S Hs.239069 'Homo sapiens sketetalnuiscbUMintel 

102620 AA376427 Hs.121513 Human ckjnBW2< mRNA Ihomcli^ " 

102636 U67092 

102667 U70867 Hs.83974 

75 102875 U72512 Hs.7771 

102698 M18667 Hs.1867 progasbicsin (pepsinogen < . 

102727 U79251 Hs.99902 oplold4)indingproteinA»ll adhesion mol 

102852 V00571 Hs.75294 corticotropin releasing hormone 

103026 X54162 Hs.79386 thyroid and eye muscle autoanllgen 01 (6 

80 103028 X543aO Hs.74094 pregnancy-zone protein 

103098 M88361 Human mRNA fbr T cell neceptor; done IG 

103117 X63578 Hs.295449 parvalbumin 

103241 X76223 H.sa{dens MAL gene exon 4 

103280 U84722 Hs.76206 Cadherin5,VE-cadhei1r ' 

85 103360 V16791 Hs.730a2 kerab'n: hdr, addk; S
103496 Y09267 Hs.132821 flarin containing monooxreenase 2 S.97

103508 Y10141 "asapiensDATI gene, partial. VNTR' 3^ 

103561 NM_001843 Hs.143434 contacflnl 140 

103569 NM_005512 Hs.151641 gtycoproleinArepefieonspredominanl 2.99 

5 103575 Z2B256 'H.sapens isofbrnt 1 gene Ibr L-type cal 4.18 

103627 248513 H.saplensXGmRNA (clone PEP6) 3.44 

103767 BE244667 HS.2961S5 CGI-IOOproldn 2-2S 

103850 AA187101 Hs.213194 Hypottiefical protein MGC10895:sim to SR 46.55 

104078 M402801 HsJ03276 ESTs 3.05 

10 104326 AW732858 Hs. 143067 ESTs 3.54 

104352 BE219898 Hs.173135 dual-spedScily tyrosIn8^Y)■phosphoIyl 3.16 

104398 AI423930 Hs.367g0 ESTs; We^ similar to putative p150|H 64.80 

104473 A1904823 Hs.31297 ESTs 3.38 

104493 AW960427 Hs.79059 ESTs; Moderately sTmllar to TGF-8ETAREC 2.47 

15 104495 AVW5687 Hs.292979 ESTs 28.60 

104S95 A1799603 Hs.271S6a ESTs 3.42 

104597 AI364504 Hs.g3987 ESTs; Wbaktyslndarto 911-1 prolyl S.00 

104659 AW989769 Hs.105201 ESTs 34.00 

104686 AA010539 Hs.18912 ESTs 

20 104691 U29690 HsJ7744 ESTs; E 

104764 AI039243 Hs.278S85 ESTs 

104776 AA026349 ESTs 

104825 AA035613 Hs.141883 ESTs 

104865 T79340 Hs.22575 Homo sa^ cDNA: FU21042 Us. done C 

25 104942 NM_016348 Hs.10235 ESTs 

104989 R65998 Hs.285243 ESTs 

109362 AWg543S5 HS.36S29 ESTs 

105101 H63202 Hs.38163 ESTs 

105173 U54617 Hs.8364 ESTs 

30 105194 R06780 H5.19800 ESTs 

105226 R589S8 Hs.26608 ESTs 

10S256 AA430850 Hs.16529 transmembrane 4 superfamily member (telr 

105394 BE24S812 Hs.8941 ESTs 

105647 Y09306 Hs.30148 homeodomain^nteracllng protein kinase 3 

35 105789 AF106941 Hs.1B142 atresfin; beta 2 

105817 AA3g7825 synaptopodin 

105847 AW964490 Hs.32241 ESTs 

105894 A1904740 Hs.25691 caldlonin receptor-like receptor adM 

105999 BE268786 Hs.21543 ESTs 

40 106075 AA04S290 HsJ25930 ESTs 

106178 AUM9935 Hs.301763 K1AA0S54 protein 

106381 AB040916 Hs.24106 ESTs 

106467 AA450040 Hs.154162 ADP-ra>osylalionfecloMII(82 

10K36 AA32g648 H5.23804 ESTs 

45 106569 R2090g Hs.30O741 sorein 

lOffiOS AW772298 Hs.21l63 Homo sapiens mRI«;cDNADKFZji5B4Bfl76(fr 

106842 AF1242S1 Hs.26054 novel SHZcontaljUng pn^ 3 

105844 AA4850S5 Hs.1 58213 spsmi assodaied an^ 6 

106870 Aig83730 Hs.26530 serum depdvadon response (phosphaSdyl 

50 106943 AW888222 Hs.9973 ESTs 

106954 AF128847 Hs.204038 ESTs 

107106 AA862496 Hs.28482 ESTs 

107163 AF233588 Hs.27018 ESTs 

107201 D20378 Hs.30731 EST 

55 107238 D59362 Hs.330777 EST 

107376 1190545 Hs.327179 solute carrier (aitdly 17 (sodium phospha 

107530 Y13622 Hs.8S087 latent transformlnBaiowBifector beta b 

107688 AW082221 Hs.60535 ESTs 

107706 AA015S79 Hs.29276 ESTs 

60 107723 AA015967 EST 

107727 AA149707 Hs.173091 DKF2P434K151 p«*h 

107750 AA017291 Hs.60781 ESTs 

107751 AA017301 HS.23S390 ESTs 
107873 AK000520 Hs.143811 ESTs 

65 107899 BE019261 Hs.B3869 ESTs;WeaMr8lnill8rtoflliALUSUBFAMI 

107994 AA036811 Hs.48469 ESTs 

107997 AIJ049176 Hs.82223 Human DMA sequence ftom done 141 H5 one 

108041 AW204712 Hs.61957 ESTs 

10B04B AI797341 Hs.165195 ESTs 

70 108338 AA070773 •aiiS3g11.s1 Stralageneia)roblast(#9372 

10B434 AA078899 "zm94b1.s1 SIralagane colon HT29 ^193722 

108447 AA079126 "zm92a11.s1 Stratagene ovarian cancer (# 

108480 AL133092 Hs.63055 ESTs 

„ 108499 AA083103 "znlbUsI Stratagene hNT neuron eH3723 

75 108535 R13949 Hs.226440 Homo sapiens done 24881 mRNA sequence 

1 085SO AAa84867 'znl Itasl SIraiagene hNT neuron (#93723 

108604 AAg34589 Hs.4g696 ESTs 

10^5 AW97233a Hs.283022 ESTs 

108629 AA102425 •zn24e6.s1 Stratagene neuroeplttieBuni NT 

80 108655 AAQ99960 •zm6Sc6.s1 Stratagene fibtoblast (#93721 

108756 AA127221 Hs.117037 Homo sapiens mRNA; cDNADKFZpa4N1 164 {f 

108864 AI733852 Hs.lgd9S7 ESTs 

108895 AL138272 Hs.62713 ESTs 

108921 AI568801 Hs.71721 ESTs 

85 1Q89S7 AA142S89 Hs.717^ ESTs
lOWOt AI05S548 Hs.72116 ESTs, ModeralBlr sioiilarto hedgehog-int ZS7

109003 M147497 Hs.7ie2S ESTs 

109004 M1S6235 Hs.139077 EST 5.M 

109065 AA16112S Hs.252739 EST 10.00 

109250 H83784 Hs.62113 ESTsiWeaMystototoPHOSPHATlDYLErHA 3.44 

109490 AA233416 Hs.139202 ESTs 2.92 

10K10 Ai7gaB83 Hs.a7191 ESTs 2.40 

109578 F02208 Hs.27214 ESTs lOJM 

109601 F02695 Hs.311662 EST 40.80 

109613 H47315 Hs77519 ESTs 54.40 

109650 R31770 HS.23S40 ESTs 31.20 

109682 H18017 Hs.22B69 ESTs B.40 

109724 IM9899 Hs.127842 ESTs 29.40 

109782 AB020644 HS.1494S long fatty acyt-CoAsyiUhetase 2 gene 8.00 

109833 R79864 Hs.29889 ESTs 10.00 

109B37 HOOese Hs.29792 ESTs 6.49
109977 T641B3 Hs.282982 ESTs 2.75
109984 AI796320 Hs.10299 ESTs 107.00
11014B H41324 Hs.31S81 ESTs; Moderately similar to SYNTAXIN 1B 2.22
110271 H28985 Hs.31330 ESTs 3.48
110280 AW874263 Hs.32468 ESTs 44.20
110420 R93141 Hs.184261 ESTs 32.00
110578 T62507 Hs.11038 ESTs 2B.40
110634 R98905 Hs.35M2 ESTs 20.00
110726 AW961818 Hs.24379 potassium voltage-gated channel; shaker- 4.15
110837 H03109 Hs.108920 ESTs; Weakly similar to semaphorin F [H. 56.80
110875 N3S070 Hs.2&401 tumor necrosis factor (ligand) superfami 3.13
110894 R92358 Hs.66881 ESTs; Moderately similar to cytoplasmic S.33
110971 AI76009B Hs.21411 ESTs 44.60
111023 AV655386 Hs.7645 ESTs 32.40
111057 T79639 Hs.14629 ESTs 17.14
111247 AW058350 Hs.16762 Homo sapiens mRNA; cDNA DKFZp564B2062 (f 4.SB
111330 BE247767 Hs.18166 MAAOOTOprotah
111737 H04607 Hs.9218 ESTs
111747 AI741471 Hs.23666 ESTs
111807 R33S08 Hs.18827 ESTs
111852 R37472 Hs.21S59 EST
112045 AI372588 Hs.8022 TU3A protein
112057 R43713 Hs.2294S EST
112214 AW148852 Hs.1673g8 ESTs
112263 1^2393 Hs.2S917 ESTs
112314 AW206093 Hs.748 ESTs
112324 R55965 Hs.26479 limbic system-associated membrane prote
112362 AW300887 Hs.26638 ESTs; Weakly similar to CD20 receptor [H
112380 H63010 Hs.5740 ESTs
112425 AA32499B Hs.321677 ESTs; Weakly similar to !!!! ALU SUBFAMI
112473 R65993 Hs.279798 pregnancy specific beta-1-glycoprotein 9
112492 N51620 Hs.28694 ESTs
112541 AFD383g2 Hs.116674 ESTs
112620 RS0552 Hs.29040 ESTs
112623 AW373104 Hs.25094 ESTs
112867 TD3254 Hs.167393 ESTs
112894 T08188 Hs.3770 ESTs
112954 AA928953 Hs.6B55 ESTs
113029 AWD81710 Hs.7369 ESTs; Weakly similar to !!!! ALU SUBFAM
113086 AA346839 Hs.2a91Q0 DKFZP434C171 protein
113140 TS0405 Hs.17S967 ESTs
113252 NM_004469 Hs.11392 c-fos induced growth factor (vascular en
11^ Alfl21378 H&159367 ESTs
113394 T81473 Hs.177894 ESTs
113437 T85349 Hs.15923 EST
113454 AI022166 Hs.16188 ESTs
113502 T89130 ESTs
113552 AI654223 Hs.16026 ESTs
113645 TgS3S8 Hs.3331B1 ESTs
113691 T98935 Hs.17932 EST
113706 AAD04693 Hs.2S9192 ESTs
113883 U89281 Hs.11958 oxidative 3 alpha hydroxysteroid dehydro
113924 BE178285 Hs.170056 Homo sapiens mRNA; cDNA DKFZp586B0220 (f
114035 W92798 Hs.269181 ESTs
114058 AKD02016 Hs.114727 ESTs
114084 AA708035 Hs.1224a ESTs
114121 H0S785 Hs.25425 ESTs
114124 W57554 Hs.125019 Human lymphoid nuclear protein (LAF-4)
114275 AW515443 Hs.3Q6117 interleukin 13 receptor; alpha 1
114297 AA149707 Hs.173091 DKFZP434K1S1 protein
114427 AAD17176 Hs.33532 ESTs; Highly similar to Mb-l protein [H
114449 AA020736 -zeeSbl 1 .si Soares nana Na>4HR Homo sa
114452 AI369275 Hs.243010 ESTs; Moderately similar to RTCOJIUMAN G
114609 AA079505 -211^35.81 Stratagene colon m29(S93722
114648 AA101056 zn2Sbas1 Stratagene neuroepithelium NT
114731 BE094291 Hs.155651 Homo sapiens HNF-3beta mRNA for hepatocy
114762 AA146g79 Hs.288464 ESTs



114776 AA151719 Hs.9S834 MM
115272 AW015947 ESTs; Weakly similar to hypothetical LI
115279 AW964897 Hs^082S
115302 AL109719
115365 AW976252 Hs ^66331
115559 AL079707 Hsi07443
115566 AI142336
AF255910 ESTs; Weakly similar to (defline not ava
AA418S38 ESTs; Highly similar to dJI 178H5.3 pisa
AA466620 Endomucin 2
115949
115965 AA001732 Hs.173233
116035 AA621405 Hs.184664
116049 AA454033 Hs.41644
116081 AI190071 Hs.S5278
116082 AB0294g6 Hs.59729
116213 AA292105 HsJ26740 leucine rich repeat (in FLII) interactin 50.60
116228 AJ7S7947 Hs^0341 ESTs; Weakly similar to (uReOn (Umusc
116250 N76712 Hs.44B29 ESTs
116419 AI613480 Hs.47152 ESTs; Weakly similar to testicular tekS
116617 D80761 Hs.45220
116784 AB007979 Hs.301281 tenascin R (restrictin; janusin)
116835 N39230 HsJa218
116870 AB(^179 Hs.9059 KIAA0962 protein
117023 AW070211 Hs.102415 ESTs
117027 AW085208 Hs.130033
117036 H88908 Hs.41192 EST
117110 AA160079 Hs.17S32 ESTs
117209 W03011 Hs.306881 ESTs
117325 M23599
117454 ^ags69 Hs.440S5 ESTs
117475 N3020S Hs.93740
117543 BE219453 Hs.42722 ESTs
117567 AW444761 Hs.44565 ESTs
117570 N48649 Hs.445S3 ESTs
117600 N34963 Hs.44676 EST
117730 N45513 Hs.48608 ESTs
117791 N48325 Hs.93958 EST
117923 N51075 Hs.47191 ESTs
117990 AA446167 Hs.4738S
118224 N6227S Hs.48503 EST
118244 N62S16 Hs.48556 ESTs
118357 AL109667 Hs.124154 Homo sapiens mRNA full length insert cDN
118446 N66361 Hs.269121 ESTs
118447 N66399 Hs.4gig3 EST
118530 N67900 Hs.118446 ESTs
118549 N68163 Hs.322954 EST
118823 W03754 Hs.50813 ESTs; Weakly similar to long chain fatty
118862 W17065 Hs.54522 ESTs
118935 AI979247 Hs.247043 KIAA0525 protein
118944 AI734233 Hs.226142 ESTs; Weakly similar to !!!! ALU SUBFAMI 14.00
118995 N94591 Hs.323056 ESTs
119073 BE245360 Hs.279477 ERG-2«^G-1; v-ets mm erythroblastosi
119268 T16335 Hs.6532S EST 31^
119514 W37937 Accession not listed in Genbank
119824 W74536 Hs.184
119831 AL117664 Hs.58419 DKFZP586L2024 protein
119861 W78816 Hs.49943 ESTs; Moderately similar to !!!! ALU SUB
11S889
118321 W86192 Hs.S8815
120094 AA811339 Hs.124049
120132 W57554 Hs.125019 Human lymphoid nuclear protein (LAF4)
1 00
120378
120404 AA223249
Hs.^572a
Hs.96427
ESTs
KIAAt013 protein
ESTs 39.40
120504 AA2S6837
120512 N55761 Hs.194718 ESTs
120667 AA287740 Hs.78335 microtubule-associated protein; RP/EB b
120777 AA287702 Hs.10031 KIAA0955 protein
121082 AA338722 ESTs 41.60
121191 AA4002(e Hs.104447 ESTs
121363
121386 AI743515 Hs.2S274 ESTs
ESTs; Moderately similar to putative sev
121518
121545 AA412442 Hs.98132 ESTs
121622 AA416931 Hs.126085 ESTs 9.00
121665 AA416556 Hs.98234 ESTs
121709 AI338247 Hs.98314 Homo sapiens mRNA; cDNA DKFZp586L0120 (f 34.80
121730 AJ1408S3 Hs.98328 ESTs 38.80
121740 AA42113a Hs.98334 EST
121772 AIS90770 Hs.110347 Homo sapiens mRNA for alpha integrin bin 36.20
121821 AL04023S Hs.3346 ESTs
121835 AB033030 Hs.300670 ESTs 2.34

121841 AA427794 Hs.104a64 ESTs 2.S1
121885 AA9348a3 Hs.98467 ESTs i2S
121888 AA426429 Hs.98463 ESTs 2.92
121938 AA428659 Hs.98610 ESTs 46.80
121950 AA429515 EST 31.40
122030 AA431310 Hs.98724 ESTs 34.40
122054 AA431725 Hs.98746 EST 3.58
122211 AA300900 Hs.98849 ESTs; Moderately similar to bithoraxi^- 49.40
122233 AA4364SS Hs.98a72 EST 29.80
122247 AA436676 Hs.98890 EST 39.80
122253 AA436703 Hs.104936 ESTs; Weakly similar to hy
122266 AA436B40 Hs.98907 EST
122285 AA436981 Hs.121602 EST
122409 AA446830 Hs.99081 ESTs
122485 AA524547 Hs.160318 phospho
122697 AA420683 Hs.96321 Homo sapiens cDNA FLJ14103 fis, clone MA
122772 AW117452 Hs.99489 ESTs
122831 AI857570 Hs.S120 ESTs
122913 AI638774 Hs.105328 ESTs
123049 BE047660 Hs.211869 ESTs
123076 AI34S569 Hs.190046 ESTs
123136 AW451999 Hs.194024 ESTs
1233M N52937 Hs.102879 ESTs
123455 AA3S3113 Hs.11Z497 ESTs
123891 AA609S79 Hs.112724 ESTs
123758 AAS09971 Hs.112795 EST
123802 AA620448 Homo sapiens clone 24760 mRNA sequence
123837 AI807243 Hs.112893 ESTs
123844 AA938905 Hs.120017 olbctwyteceplor^OirTiSubfantiy
123938 NM_004673 Hs.241St9 ESTs
123987 C21171 Hs.9S497 ESTs; Weakly similar to GLUCOSE TRANSPOR
124013 AI521936 Hs.107149 ESTs; Weakly similar to PTB-ASSOCIATED S
124160 R40290 Hs.12468S ESTs
124205 H77570 Hs.108135 ESTs
124226 AA618S27 Hs.1S02% ESTs
124246 H67680 Hs.270962 ESTs
124348 AI796320 Hs.10299 ESTs
124358 AW070211 Hs.102415 >(35g11.s1 Morton Fetal Cochlea Homo sa
124409 A1BU166 Hs.107197 ESTs
124442 AWS6^ HsJZBSBZ TATA box binding protein (TBP)-associate
124468 N51413 Hs.109284 ESTs
124479 AB011130 Hs.127436 calcium channel; voltage-dependent; alph
124519 AI8700S6 Hs.137274 ESTs; Weakly similar to SPLICEOSOME ASSO
124711 NM_004657 Hs.2M30 serum deprivation response (phosphatidyl
124868 AI768289 Hs.304389 ESTs
124874 BES50182 Hs.127826 ESTs
125097 AW576389 Hs.335774 ESTs
125179 AW2D8468 Hs.103118 ESTs
125200 AW838531 Hs.1031S6 ESTs
125299 T32982 Hs.ia272D ESTs
12S4(X> AL110151 Hs.128797 DKFZP586O0824 protein
125810 H00083 aryl hydrocarbon receptor-interacting pr
126176 BE242256 Hs.2441 KIAA0022 gene product
126303 D78841 HUM52SA0SB Human placenta polyA+CTFif
126403 AW629054 Hs.125976 ESTs; Weakly similar to metalloprotease
126S07 AU140137 Hs.23g64 ESTs; Weakly similar to HCI ORF [M.muscu
126773 AA648284 Hs.187S84 ESTs
127307 AW862712 Hs.126712 ESTs; Weakly similar to plL2 hypothetica
127462 AA760776 Hs.293977 aa59b04.s1 NCI_CGAP_GCB1 Homo sapiens c
127S72 AAS94027 Hs.191788 ESTs
127609 XSOOSI Hs.5^ ESTs
127832 AW976035 Hs.292396 ESTs
127898 AA77472S Hs.128970 ESTs
128073 AW340720 Hs.12S9a3 ESTs
128101 AAg05730 Hs.128254 ESTs
128149 NM_012214 Hs.177576 mannosyl (alpha-1;3-)-glycoprotein beta-
128212 WZ7411 HsXSSX glutathione peroxidase 3 (plasma
128333 W68800 Hs.12126 ESTs; Weakly similar to LR8 [H.sapiens]
128364 N76462 Hsi691S2 ESTs; Weakly similar to ZINC FINGER PROT
128426 AI265784 Hs.14S197 ESTs
128598 AA305407 Hs.102308 potassium inwardly-rectifying channel; s
128634 AA464918 ESTs; Moderately similar to !!!! ALU SUB
128687 AW271273 Hs.23767 ESTs
128726 AI311238 Hs.104478 ESTs
128773 NM_004131 Hs.1<B1 granzyme B (granzyme 2; cytotoxic T-lymp
128833 W26667 Hs.184S81
128870 H3S537 Hs.75309
128878 R25513 Hs.lora3 ESTs
128885 AF134803 Hs.180141 cofilin 2 (muscle)
128998 WW245 Hs.107761 ESTs; Weakly similar to PUTATIVE RHO/RAC
1^00 AA744902 Hs.107767 ESTs; Modaralriysir«tebCrf*Mltnhl
129038 AW1S6903 Hs.108124 ribosomal protein U1
129098 AWS80945 Hs.330466 ESTs
129210 AL039940 Hs.202949

129240 AA38125B H&237868 

129262 SE222198 Hs.10g843 

129301 AF182277 Hs^0780 

129331 AW167668 Hs.279772 

129381 AW245805 Hs.110903 

129565 X77777 Hs.198726 

129595 U09550 Hs.1154 

129613 AW978517 Hs.172847 

129782 AW016932 Hs.104105 

129950 R>77B3 Hs.1389 

129958 R27496 Ks.1378 

129959 ALJ038554 Hsi!74463 
130180 AA305688 HS.26769S 
130259 NM_000328 Hs.153614
130273 AW972422 Hs.153863 
130312 AF056195 Hs.15430 
130«6 NM.001928 HS.1S5597 
130523 AA999702 Hs.214507 
130799 AB02894S Hs.12686 
130885 NM_0OS883 Hs.20912 
131002 ALOS0295 Hs.22039 
131012 AL039940 Hs.202949
131031 NM_001650 HS.2886S0 
131061 N64328 Hs.268744 
131066 AW169287 H3.2258a 
131082 AI091121 Hs.246218 
131087 AF147709 Hs.22824 
131161 AF033382 Hs.23735 
131179 AA171388 Hs.184482 
131182 Aie24144 H&23912 
131205 NM_003102 Hs.242J 
131277 AA131466 Hs^3767 

131281 AA251716 Hs.25227 

131282 XD3350 
131285 AI567943 
131355 R52804 HS.25955 
131391 AW0aS781 Hs.26270 
131461 AA992841 Hs.27263 
131487 F13036 Hs.27373 
131517 AB037789 Hs.263395 
131545 AL137432 Hs.28564 
131583 AKfl00383 Hs.323092 
131647 AA359615 HsJ0089 

131675 H15205 H&30509 

131676 AI126821 
131708 S60415 
131717 X94630 
131756 AA443966 Hs.31595 
131762 AA744902 Hs.107767
131821 AA017247 Hs.164577 
131839 AB014533 Hs.33010 
131861 AU395858 Hs.1 84245 
13a>15 A1418006 Hs.3731 
132070 BE622S41 Hs.38489 
132242 AA332657 Hs.42721 
132334 AW080704 Hs.45033 
132476 ALl 19844 Hs.49476 
132480 NM_0O1290 Hs.4980 
132533 Ai922988 Hs.172510 
13K98 X80031 Hs.530 
13^19 H28855 Hs.S3447 
13Z52 ^M1739 Hs.61260 



RS1604 Ns.300842 

BE384g32 Hs.64313 

NVL003278 Hs.65424 

AA428580 Hs.655S1 

AA026S33 Hs.66 

NI«L0140S1 H3.94a96 

AA903424 Hs.6786 

AW978439 Hs.69S04 

AJ131245 Hs.7239 

AF017987 Hs.7306 

AL134030 85.264180 

U41518 Hs.74602 

BE143455 Hs.75415 

NM_001872 Hs.75572 

TSB4fi6 HS.222S66 

AF035718 Hs.7805t 

L34657 Hs.78146 

AW175787 HSJ34841 

AI372588 Hs.8022 

AA285136 K5.301914 

Af873257 Hs.7994

Human cytochrome P450-IIB PB3) mRNA:

ESTs; Highly similar to CGI-38 protein (

claudin 5 (transmembrane protein deleted

vasoactive intestinal peptide receptor 1

oviductal glycoprotein 1; 120kD

ESTs; Weakly similar to collagen alpha 1

EST



defensin; alpha 1; myeloid-related seque

UDP-Gal:betaGlcNAc beta 1;3-galactosyltr

retinitis pigmentosa GTPase regulator

MAD (mothers against decapentaplegic; Dr

DKFZP58661219 protein

D component of complement (adipsin)

ESTs



42^ 
51.60 



ibeaRIike 



ESTs; Moderately similar to KIAA0273 [H.
ESTs

ESTs; Weakly similar to zinc finger prot
ESTs; Weakly similar to p160 myb-binding



Hs.30514
Hs.30941
Hs.3107



ESTs

ESTs
ESTs

alcohol dehydrogenase 3 (class I); gamma
ESTs; Moderately similar to putative sev
DKFZP564D206 protein
ESTs

butyrate response factor 2 (EGF-response 28.80
Homo sapiens mRNA; cDNA DKFZp56401763 (f
ESTs; Highly similar to semaphorin VIa [ 39.00
ESTs

ESTs; Weakly similar to dual specificity

ESTs

ESTs

ESTs 45«)



ESTs; Moderately similar to Cam\ bdil
ESTs

KIAA0633 protein

KIAA0929 protein Msx2 interacting nuclea
ESTs 



133071 
133120 
133129 
133147 
133151 
133213
133276 
133377 
133407 



133656 
133869 
133779 
133978 
133985 
134000
134111 
134185 
134204 



ESTs 
ESTs 
ESTs 

SEC24 (S. cerevisiae) related gene fami
secreted frizzled-related protein 1
protocadherin 2 (cadherin-like 2)
aquaporin 1 (channel-forming integral pr
Accession not listed in Genbank
carboxypeptidase B2 (plasma)
ESTs

transcription factor 21
platelet/endothelial cell adhesion molec
selenium binding protein 1
TU3A protein

Homo sapiens mRNA; cDNA DKFZp586K1220
ESTs; Weakly similar to CO^ protein [

U33749 Hs.197764
An29008 Hs.333383
N50465 Hs.92927
AW796190 Hs.93S78
135091 AM93650 Hs!94^^ ESTs
135135 AA775910 Hs.9S0U syntrophin; 1 (dystrophin-associate 8.00
135203 C1S737 Hs.269386 ESTs 4.31
135236 AI638208 Hs.g6901 ESTs 43.00
135266 R41179 Hs.97393 Human mRNA for KIAA0328 gene; partial cd
135346 NM_000928 Hs.992 phospholipase A2; group IB (pancreas) 3.82
135378 AW961818 Hs.24379 potassium voltage-gated channel; shaker- 4.15
135387 NM_001972 Hs.g9K3 elastase 2; neutrophil 37 JO
135388 W27g65 Hs.gg88S EST 3&80
135402 L12398 Hs.g9g22 dopamine receptor D4 4.21



Table 2B shows the accession numbers for those pkeys lacking UnigeneIDs for Table 2A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the "Accession" column.



CAT number: Gene cluster number

Pkey CAT number Accessions
108447 43452.-7 AA07912B
108550 120073_1
108655 12752^.1 AA099960 AA113013
102397 44371.-1 041898
126303 1525933.1 D78841 D78880
125810 1554054.1 H00083 R81082
103627 261SL.2 Z48513 Z48512
121366 280401.1 AI743515 AA40S617 AWZ767D6
114609 116777J AA079505 AA079S37
115272 172113.f AW015947 AA211890 AA27942S
108338 112188.1 AA070773 AA070774
108434 114012.1 AA0788g9 AA078782 AA07S7a8
123802 genbank_AA620448 AA620448
102310 NOT_FOUND_entrez_U33839 U33839
102636 entrez_U67092 U67092
104776 genbank_AA026349 AA026349
120504 genbank_AA2S6837 AA2S6837
113502 genbank_T89130 T89130
108499 genbank_AA083103 AA083103
101308 entrez_L41390 L41390
108629 genbank_AA102425 AA102425
103098 221.215 M86361 Z26S93)(02B50O13070AB1006S9M17649^O7869lm787t X61077M16286Aroi8169X61079S59351 X60142AF04316^
103241 entrez_X76223 X76223
103508 entrez_Y10141 Y10141
103575
119514
121082
128634
105817 genbank_AA397825 AA397825
121518 genbank_AA412155 AA412155
114449 genbank_AA020736 AA020736
114648 genbank_AA101056 AA101056
121950 genbank_AA429515 AA429515
107723 genbank_AA015967 AA015967

Table 3A shows 452 genes up-regulated in chronically diseased lung. Chronically diseased lung samples represent chronic non-malignant lung diseases
such as fibrosis, emphysema, and bronchitis. These genes were selected from 59680 probesets on the Eos/Affymetrix Hu03 Genechip array. Gene expression data for each
probeset obtained from this analysis was expressed as ... mRNA expression.

Pkey: Unique Eos probeset identifier number
ExAccn:
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: 80th percentile of AI for chronically diseased lung samples divided by the 90th percentile of AI for normal lung samples.
R2: 80th percentile of AI for chronically diseased lung samples divided by the 90th percentile of normal lung samples, squamous cell carcinomas and
adenocarcinomas
R3: 70th percentile of AI for chronically diseased lung samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples
divided by the 90th percentile of normal lung samples, squamous cell carcinomas and adenocarcinomas minus the 15th percentile of AI for all normal lung,
chronically diseased lung and tumor samples
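
The three ratios defined above can be illustrated with a short sketch. It is not part of the original disclosure: the function names `percentile` and `lung_ratios` are hypothetical, the percentile is computed by linear interpolation between ranks (the text does not state which percentile convention was used), and "tumor samples" is assumed to mean the squamous cell carcinomas and adenocarcinomas pooled together.

```python
from typing import Sequence, Tuple


def percentile(values: Sequence[float], q: float) -> float:
    """Return the q-th percentile (0-100) using linear interpolation.

    Assumption: the source does not say how percentiles were computed;
    linear interpolation between order statistics is used here.
    """
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def lung_ratios(
    diseased: Sequence[float],
    normal: Sequence[float],
    tumor: Sequence[float],
) -> Tuple[float, float, float]:
    """Compute (R1, R2, R3) for one probeset from AI values.

    diseased: AI values for chronically diseased lung samples
    normal:   AI values for normal lung samples
    tumor:    AI values for squamous cell carcinomas and adenocarcinomas
              (assumed to be what the text calls "tumor samples")
    """
    normal_and_tumor = list(normal) + list(tumor)
    all_samples = list(normal) + list(diseased) + list(tumor)

    # 15th percentile over every sample, subtracted in R3 as a baseline.
    baseline = percentile(all_samples, 15)

    r1 = percentile(diseased, 80) / percentile(normal, 90)
    r2 = percentile(diseased, 80) / percentile(normal_and_tumor, 90)
    r3 = (percentile(diseased, 70) - baseline) / (
        percentile(normal_and_tumor, 90) - baseline
    )
    return r1, r2, r3
```

For example, `lung_ratios([10, 20, 30, 40, 50], [1, 2, 3, 4, 5], [2, 4, 6, 8, 10])` returns one (R1, R2, R3) triple for a single probeset; a ratio well above 1 corresponds to a probeset whose signal is higher in chronically diseased lung than in the comparison samples.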



Pkey ExAccn UnigeneID Unigene Title
135423 US0531 Hs.138751 Human BRCA2 region, mRNA sequence CG03
135378 AW961818 Hs.24379 MUM2 protein
135346 NM_000928 Hs.992 phospholipase A2, group IB (pancreas)
135235 AW298244 Hs.293507 ESTs
133)57 U90268 Hs.93810 cerebral cavernous malformations 1
1349S1 BE305081 Hs.169358 hypothetical protein
134799 M36821 Hs.89690 GRO3 oncogene
134786 TS618 Hs.89640 TEK tyrosine kinase, endothelial (venous
134772 NM_000829 Hs.163697 glutamate receptor, ionotrophic, AMPA 4
134752 BE246762 Hs.89499 arachidonate 5-lipoxygenase
134749 T28499 Hs.8948S carbonic anhydrase IV
134696 BE328276 Hs.8861 ESTs
134636 NM_0D5562 Hs.8720S lymphocyte antigen 64 (mouse) homolog, r
134627 AI018768 Hs.12482 glyceronephosphate O-acyltransferase
134622 AW975159 Hs.293097 ESTs, Weakly similar to A5538I> faciogeni
134570 U66615 Hs.f72280 SWI/SNF related, matrix associated, acti
134561 U76421 Hs.85302 adenosine deaminase, RNA-specific, B1 (h
134468 NM_001772 Hs.83731 CD33 antigen (gp67)
134417 NMJ)08416 Hs.82g21 solute carrier family 35 (CMP-sialic aci
134343 D50683 HsJBmS
134323 BE1706S1 Hs.S700 deleted in liver cancer 1
134300 NM_001430 Hs.8136 endothelial PAS domain protein 1
134299 AW580939 Hs.g7199 complement component C1q receptor
134253 XS2075 Hs.80738 sialophorin (gpL115, leukosialin, CD43)
134182 052059 Hs.7972 KIAA0871 protein
133985 L34657 Hs.78146 platelet/endothelial cell adhesion molec
133978 AF035718 Hs.78061 transcription factor 21
13M35 AI677897 Hs.76640 RGC32 protein
133651 AI301740 Hs.173381 dihydropyrimidinase-like 2
133633 D21262 Hs.75337 nucleolar and coiled-body phosphoprotein
133565 AW955776 Hs.313500 ESTs, Moderately similar to ALU7_HUMAN A
133548 AWg46384 Hs.178112 DNA segment, single copy probe IMS-CAIA.
133488 AA33S295 Hs.74120 adipose specific 2
133478 X83703 Hs.31432 cardiac ankyrin repeat protein
133337 AF085983 Hs.293676 ESTs
133200 AB037715 Hs.183639 hypothetical protein FLJ10210
133153 AF070592 Hs.66170 HSKM-B protein
133130 AI128606 Hs.6SS7 zinc finger protein llSI
133120 NM_003278 Hs.65424 tetranectin (plasminogen-binding protein
132928 AW16B082 Hs.169449 protein kinase C, alpha
132836 AB023177 Hs.29900 KIAA0960 protein
132799 W73311 Hs.169407 SAC2 (suppressor of actin mutations 2,
132742 AA025480 HsJJ92812 ESTs, Weakly similar to T33468 hypotheti
132548 X12830 Hs.193400 interleukin 6 receptor
132476 AL119844 Hs.49476 Homo sapiens clone TUA8 Cri-du-chat regi
132439 AK001942 Hs.4863 hypothetical protein DKFZp566A1524
132240 AB018324 Hs.42676 KIAA0781 protein
132210 NM_007203 Hs.42322 A kinase (PRKA) anchor protein 2
132199 AL041299 Hs.165084 ESTs
131751 T96555 Hs.31S62 ESTs
131745 AI828S59 HsJ1447 ESTs, Moderately similar to A46010 X-l
131694 NM_000246 Hs.3076 MHC class II transactivator
131686 NM_012296 HsJ0K7 GRB2-associated binding protein 2
131676 AI126821 Hs.30514 ESTs
131629 245794 Hs.238809 ESTs
131589 CISffiS Hs.29191 epithelial membrane protein 2
131536 AA019201 Hs.2692ia ESTs
131517 AB037789 Hs.263395 sema domain, transmembrane domain (TM),
131355 R52804 Hs5S956 DKFZP564O208 protein
131253 R7ia02 Hsi4a53 ESTs
131207 AF1042S6 Hs.24212 latrophilin
131156 AI4722D9 Hs.323117 ESTs
131066 AW169287 Hs.22588 ESTs
131061 N64328 Hs.268744 KIAA1796 protein
131053 AA348541 Hs.29626t guanine nucleotide binding protein (G pr
130895 AA641767 HsilOIS hypothetical protein DKFZp564lJ0884 siml
130762 D84371 Hs.1898 paraoxonase 1
13.20
AW337575 Hs.201591 ESTs






AI831952 cysteine-rich protein 1 (intestinal)
DKFZP434H204 protein
Hs.182611 solute carrier family 11 (proton-coupled
integrin, alpha 1
eukaryotic translation initiation factor 11.60
AW972422 Hs.153663 MAD (mothers against decapentaplegic, Dr
NM_000328 Hs.153614 retinitis pigmentosa GTPase regulator
Hs.132330 zinc finger protein 38 (KOX 18) 21.20
annexin A3
ESTs
AA181018 hypothetical protein FLJ13920 18.60
AB007899 homolog of yeast ubiquitin-protein ligas
ferritin, light polypeptide
Homo sapiens cDNA FLJ12566 fis, clone NT 22.63
Homo sapiens mRNA; cDNA DKFZp586U)1» (f
vasoactive intestinal peptide receptor 1
Hs^0847 delta-tubulin 39.20
ESTs
Rag C protein 15.20
spondyloepiphyseal dysplasia, late 12.40
129312 T97579 Hs.110334 ESTs, similar to 178885 serineAii 20.83
Hs.237868 interleukin 7 receptor
AL039940 KIAA1102 protein
nudix (nucleoside diphosphate linked moi
CDW52 antigen (CAMPATH-1 antigen)
Hs.107318 kynurenine 3-monooxygenase (kynurenine 3
Ar015525 Hs.302043 chemokine (C-C motif) receptor-like 2
AW368576 Hs.139851 caveolin 2
Hs.186709 ESTs, Weakly similar to 138022 hypothet 12.20
AWj60432 craniofacial development protein 1
KIAA1080 protein; Golgi-associated, gamm 26.40
128624 BE1S47es Hs!l02647 ESTs, Weakly similar to TRHY_HUMAN TRICH
NM_003616 Hs.102456 survival of motor neuron protein interac 16.00
NM_00491S Hs.10237 ATP-binding cassette, sub-family G (WHIT 1Z80
AA305407 Hs.102303 potassium inwardly-rectifying channel, s
ESTs
API 50882 Hs.186877 sodium channel, voltage-gated, type XII, 17.20
AA630201 Hs.! 24347 ESTs 21.30
AI302471 Hs.124292 Homo sapiens cDNA: FLJ23123 fis, clone L
AI557081 Hs.262476 S-adenosylmethionine decarboxylase 1 10.60
AI6695B6 Hs.3626 mitogen-activated protein kinase kinase
ESTs 13.40
AA7618Q2 Hs.291559 ESTs 14.00
AA836641 Hs.163085 ESTs 14.00
AW293496 Hs.180138 ESTs 11.00
AI240t02 Hs.322430 NDRG family, member 4 11.10
collagen, type IV, alpha 3 (Goodpasture
AA9u8954 ESTs 19.60
AK000767 Hs.157392 Homo sapiens cDNA FLJ20760 fis, clone CO 15.40
ESTs 17.50
Hs^7Q224 ESTs 14.60
127398 L31988 Hs.187991 DKFZP564A122 protein 15.40
AA442797 ESTs, Weakly similar to 138022 hypothet 14.60
DnaJ (Hsp40) homolog, subfamily B, membe 21.00
BE047653 Hs.119153 ESTs, Weakly similar to ZN91_HUMAN ZINC 15.80
Hs.12G712 ESTs, Weakly similar to AF191020 1 E2IGS
Hs.181301 cathepsin S 22.60
127167 AA62S6gO Hs.190272 ESTs 21.40
ESTs 41.20
ESTs 11.00
AF137388 plasmolipin
gb:zu68c0lrl Soares_testis_NHT Homo sap
gb:Msg2228.seq.F Human fetal heart, Lamb 12.20
A6037o6u nuclear factor I/A 17.19
Hs.151999 ESTs 13.57
AAjioifli Hs.61635 six transmembrane epithelial antigen of 15.40
Homo sapiens cDNA: FLJ22783 fis, clone K
membrane-associated nucleic acid binding 18.00
gb:EST28707 Cerebellum K Homo sapiens c 16.77
AVU9791S5 Hs.298275 amino acid transporter 2 14.60
Hs.13649 Novel human gene mapping to chomosome 13
ESTs 13.40
AW752782 hypothetical protein mi0546
ESTs 18.20
14.00
126077 lil^8772 Hs!210836 ESTs 16.59
125994 AI990529 Hs.270799 ESTs 17.40
12S934 AA193325 Hs.32646 hypothetical protein inJ21S01
12SB47 AW161885 Hs.24S034 ESTs 49^57
125831 H04043 gb7|4Sc03j1 Soares placenta NbW Homo
ESTs
125731 R61771 Hs.26912 13.20
125676 BE612918 Hs.151973 hypothetical protein FLJ23511 U20
125561 F18S72 Hs.22978 ESTs, Weakly similar to ALtMJIUMAN ALU S
125552 H09701 H1278366 ESTs, Weakly similar to 138022 hypotheti 12.80
125489 H49193 Hs.124984 ESTs, Moderately similar to ALU7_HUMAN A 33.40
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wo 02/086443 

125422 AAS03229 H5.153717 

AM22936 
125309 T12411 
125167 AL137540 
125139 AW194933 
125042 T7S906 
124711 NIUL004657 
124631 NM_014053 Hs270594 
124578 N68321 Hs.231500 
Hs.42322 
Hs.1 02670 
H5.11030 



Hs.161378 

HS.1B3745 

Hs.102541 

Hs5788 

Hs.269432 



ESTs 
ESTs 

hypoUietical protein FU13456 

hypothetical protein MGC10924 sin^ to 
ESTs. Modeialely similar to ALU1_HtAUN 
seniin deprivaSon response bdnsphafklyl 
avCRpiDteai 



PCT/US02/12476 



18^ 

21^0 

23^ 
21.43 



124472 NS2S17 

124438 BE178S36 

124357 N22401 

124306 AW973078 

124214 H58608 

124097 AVy29823S 

123978 T89832 

123972 T46848 

123961 A1JOS0184 

12393S NIVL004673 

123802 AA620448 

123734 AA609881 

123619 AA602964 

123596 AA421130 

123476 AA384564 

123340 AAS04264 

123190 AA4a9212 

123136 AW4S1999 

123073 M485061 

123055 AA482005 

122699 AA456130 

122679 AA811286 

122633 I4M_001546 HsJ4B53 

122553 AA451884 Hs.190121 

122544 AW973253 

122485 AA524547 



Hs.293039 
Hs.1 51323 
Hs.101689 
HS.17027B 
Hs.70337 
Hs.21610 
HS.241S19 

Hs.312447 



Hs.105228 
Hs.194024 
HS.10S6S2 
Hs.105102 
Hs.301721 
Hs.192837 



122211 AA300900 

122127 AW207176 

122011 AA431082 

121992 A186077S 

121989 W58487 

121835 AB033030 

121726 AF241254 

121690 AV660305 

121643 AA640987 

121633 AA417011 

121622 AA416931 

121497 AA412031 

121351 AW206227 

121314 Vro7343 

121242 AA4008S7 

121059 AA393283 

120934 AA226ig8 

120755 AA312934 

120637 AA811804 

120484 AA253170 

120336 N85785 

120266 AI807264 

120132 W57554 

120041 AA830882 

119996 W88998 

119970 AA767718 

119861 W78816 

119824 W74536 

119740 AWQ21407 

119271 AIC81118 

119221 C14322 

119126 R4S17S 

119073 BEJ45360 

118928 AA312799 

118901 AW292577 

118661 AL137554 

118807 AI377444 

118449 AI81386S 

116416 N66028 

118379 N64491 

118329 N83520 

118320 N634S1 

118253 AA497044 

118124 N56968 

118056 AB037746 

118032 N52S02 

117840 T26379 

117404 N3972S 

117314 N32498 



Hs.160318 
Hs.98849 
Hs.106771 

Hs.9eS06 
Hs.193784 
Hs.300670 
Hs.178098 
Hs.110286 
Hs.193767 
Hs.98175 
Hs.126065 
Hs.97901 
Hs.287727 
Hs. 182538 
Hs.97509 



EST

membrane-spanning 4-domains, subfamily A

gb:yw37g07.s1 Morton Fetal Cochlea Homo

ESTs

ESTs

ESTs

ESTs

immunoglobulin superfamily, member 4
DKFZP434B203 protein
angiopoietin-like 1

gb:ae58c09.s1 Stratagene lung carcinoma
ESTs

gb:no97c02.s1 NCI_CGAP_Pr2 Homo sapiens
EST



EST
ESTs
ESTs

ESTs, Weakly similar to reverse transcri
KIAA1255 protein

ESTs, Weakly similar to ALUS_HUMAN ALU S
inhibitor of DNA binding 4, dominant neg
ESTs
ESTs

FXYD domain-containing ion transport reg
ESTs, Moderately similar to AF151511 1 H
ESTs

gb:zw78a10.s1 Soares_testis_NHT Homo sap
ESTs

Homo sapiens mRNA; cDNA DKFZp586K1922 (f
KIAA1204 protein

angiotensin I converting enzyme (peptidy

ESTs

ESTs

EST

ESTs

EST

hypothetical protein FLJ23132



3U0 

14.40 

40.00 
15.40 



gb2t74e03j1 SoaresJesSs JIHT Homo sap 
gbJio26a07.s1 Na_CGAPJ»r1 Homo sapiens 
Homo sapiens cDNA: FU21326 lis, clone 
gb:ob39aa5.s1 t4CI_CGAP_GCB1 Homo sapiens 



ESTs, WeaMy Mar to T34036 hypotheU 



hypolhsficd protein FU10512 
ESTs, Wealdy slmOar to S65657 alpha-1C- 
advanced glycosyt^ end pnduc|.«pecl 
hypothetical proteh 



Hs.98473 

HS.18116S 

HS.20S442 

Hs.125019 

Hs.59368 

Hs^9134 

Hs.93581 

Hs.49943 

Hs.184 

Hs.21068 



HS.2SO700 tryptasebetal 

Hs.1 17183 

Hs.279477 
Hs,283689 
Hs.94445 
Hs.49927 
KS.S4245 
KS.1B4478 
HS.4910S 
HS.4S990 



21.20 
20.00 



protein UnaseNYD^PIS 
ESTs, Vfealdy ^iar to S65824 reverse t 
hypotheOcal protein aJ21939 similar to 



Hs.141600 

Hs.46707 
Hs.427a 
HS.47S44 



gbwOaOUl Soares_fnuitipIe_sclerosis_ 
ESTs. Weaidy simflar to allamatively s 
hypoiheticd protein aJ10392 
duomosocns 21 open reading frame 37 
J protein OKFZp761O01 13 






Hs.102415 

Hs.301281 
HS.9S097 
HS.6193S 
HS.490S0 
Hs.82501 



Hs.202949 
Hs.15220 
HS.31S75 
Hs.172572 
Hs.173233 
Hs^198 
HS.33293S 

Hs^eso 

Hs.269908 
Hs.73251 
Hs.43977 
Hs.184411 
Hs^0825 
Hs.124232 
Hs.11387 
H&878SG 
H8.188717 
Hs.87491 




117209 W03011 Hs.306861 
117023 AW070211 
116814 H5Ca34 
116784 AB007979 
116786 AI6086S7 
116712 AW901618 
116707 H10344 
1163S1 AL133623 
116279 AW971248 
116166 AL039940 
116152 AL04O521 
116117 BE613410 
116107 AL133916 
11S965 AA001732 
115955 AFa63613 
115844 

14S683 AF255910 
115673 AA406341 
11S672 AI889110 
11SS66 AI142336 
115313 AASOSOOI 
115279 AW964897 
115230 AA278300 
11S110 AK001S71 
114999 BE246481 
114930 AA237022 
114922 AA23S672 
114837 BE244930 
114769 AA14g060 
114761 M143781 
114736 A1610347 
114596 AA310162 
114S18 AW163267 
114455 H37908 
114452 AI369275 
114359 NM_016929 
114357 R41677 
114251 H15ffi1 
114138 AW384793 
114124 VW7554 
113946 AIAHI83883 
113695 T9SS65 

113606 »M_013343 Hs.278951 
113590 R4g642 Hs.142447 
113560 T9101S Hs^68626 
113552 AI6S4223 Hs.16026 
113540 AW152618 Hs.16757 
113502 T89130 

113^8 AI07GB38 Hs.12967 
Hs.11392 
HS.1B9S13 
Hb.10305 
Hs^81 
Hs^0862 
Hs.8198 
lfe.7246 
Hs,6295 
Hs.293147 






Hs.296100 



H5.103812 

Hs.169248 

Hs.106469 

Hs^1616 

Hsi43O10 

Hs.283021 

Hs.6107 

Hs^1948 

Hs.15740 

Hs.125019 

Hs^896 

Hs.17948 



113203 AA743S63 

113195 H8326S 

113089 T40707 

113076 AF033199 

113009 T23699 

112937 AI694320 

112891 T03927 

112794 R97018 

112691 RB8708 

112602 AW004045 

112356 AH)3S316 

112210 R49645 

112084 AUM93g0 

111998 R42379 

111987 NM_015310 

111803 AA593731 

111737 H04607 

111605 T91061 

111510 R078S6 

111341 AL157484 

111280 AA373527 

111247 AW0583S0 

111232 A1247763 

110942 RS3503 

110924 AW058463 

110837 H03109 

110824 AI767183 

110776 /a032417 

110576 H60869 

110369 AK000768 

110099 R44557 

109984 Ar796320 

109958 AA001266 



Hs^0647 



Hs.6763 



Hs.9218 

Hs.194178 

Hs.16355 

Hs^4e3 

HS.1938S 

Hs.16762 

Hs.16928 

Hs^8419 

Hs.12940 

Hs.108920 

Hs.26942 

HS.19S4S 

Hs.37889 

Hs.107872 

H&23748 

Hs.10299 

Hs.133521 

HsJ0484 



MSTP043 protein

Homo sapiens mRNA; cDNA DKFZp586N0121 (f
gb:yp66a10.s1 Soares fetal liver spleen
Homo sapiens mRNA, chromosome 1 specific
ESTs

Homo sapiens mRNA; cDNA DKFZp761l071 (fr
ESTs, Weakly similar to A Chain A, Human
similar to mouse Xrn1 / Dhm2 protein
ESTs, Weakly similar to ALU1_HUMAN ALU S
KIAA1102 protein
zinc finger protein 106
SEC63, endoplasmic reticulum translocon
hypothetical protein FLJ20093
1HJ10970



Homo sapiens cDNA FLJ11S91 fis, clone HE
ESTs

Human DNA sequence from clone RP11-196N1



ESTs
ESTs

hypothetical protein FLJ23393

ESTs, Moderately similar to ALU1_HUMAN A



20.20 
1&20 



43.70 
11.00 
14.00 



suppressor of var1 (S. cerevisiae) 3-like 
ESTs, Weakly similar to ALU8_HUMAN ALU S 
Homo sapiens cDNA FLJ14445 fis, clone HE 



Homo sapiens cDNA FLJ14839 fis, clone OV 



hypothetical protein FLJ23191 
ESTs 

0b7e12(101.s1 Stratagenekino ^37210) H 
ESTs 

c-fos induced growth factor (vascular en 

ESTs 

ESTs 

ESTs, Weakly similar to S41044 chromosom 
ESTs 

zinc finger protein 204 
ESTs 

ESTs, Weakly similar to T17248 hypothet 
ESTs, Moderately similar to A46010 
gb:yq74fa08.s1 Soares fetal liver spleen 
ESTs 
ESTs 

Homo sapiens done 23705 mRNA sequence 
ESTs 

Homo sapiens mRNA; cDNA DKFZp586O1318 (f 
ESTs 

KIAA0942 protein 

ESTs. Moderately similar to ALU5_HUMAN A 
ESTs 

ESTs, Moderately similar to PC4259 ferri 
ESTs 

Homo sapiens mRNA; cDNA DKFZp762M127 (fr 
CGI-Mproldn 

Homo sapiens mRNA; cDNA DKFZp5S4B2062 (f 
ESTs 



21.20 
14.33 



10.57 
26.60 
15.33 



frizzled (Drosophila) homolog 4 
ESTs 

hypothetical protein FLJ20761 
ESTs 

Homo sapiens cDNA FLJ13545 fis, clone PL 



1Z20 
13.00 





AWB18436 




solute carrier family 16 (monocarboxylic 








ESTs, Weakly similar to I38022 hypotheti 




AI800S15 






1096B8 


R41900 














109S13 


H47315 


Hs.27519 






AW02i4a8 


Hs.26981 






AW193342 






109472 


AK001989 


Hs.91165 


hypothetical protein 




AAS24525 




DKFZP586C1620 protein 




AW978515 


Hs.131915 


KIAA0863 protein 








gb:zn98Q07.s1 Stratagene fetal retina 93 




BE219231 


Hs.292653 






AA086005 








AL133092 




hypotneticai protein UKrZp434Ki42B 




NM_006770 


Hs.67726 


macrophage receptor with collagenous str 




AA055632 


HS.303D70 




108138 


AL049990 




Homo sapiens mRNA; cDNA DKFZp564G112 (fr 


108087 


AA04570B 






108048 


AI797341 


HS.16519S 


Homo sapiens cDNA FLJ14237 fis, clone MT 


108041 


AW204712 


Hs.61957 


ESTs 


107997 


AL049176 


Hs.82223 


chordin-like 


107994 


AA036811 


Hs.48469 


LIM domains containing 1 


107922 


BE1538SS 


Hs.61460 


Ig superfamily receptor LNIR 


107681 


BE379594 


Hs.49136 


ESTs, Moderately similar to ALU7_HUMAN A 


107666 


AA010611 


Hs.60418 


EST 


107332 


T87750 


Hs.183297 


DKFZP566F2124 protein 


107292 


BE165479 




Homo sapiens serologically defined breas 


107230 


AI034467 


Hs.34850 


ESTs 


107168 


W57578 


Hs.237955 


RAB7, member RAS oncogene family 


107160 


AA314490 




KIAA1S63 protein 


107054 


AI076459 


Hs.15978 


KIAA1272 protein 




AF264750 


Hs.288971 


myeloid/lymphoid or mixed-lineage leukem 


106999 


H932B1 


Hs.10710 


hypothetical protein FLJ20417 


106954 


AF128847 


HS.20403S 


indolethylamine N-methyltransferase 




AI933730 


Hs.26530 


serum deprivation response (phosphatidyl 


106865 


AW192535 


Hs.ig479 


ESTs 


106844 


M485055 


Hs.158213 


sperm associated antigen 6 




NM_016831 




period (Drosophila) homolog 3 


106818 


AK002135 


HS.3S42 


hypothetical protein FLJ11273 


106797 


A1768801 


HS.169S43 


Homo sapiens cDNA FLJ13569 fis, clone PL 


108773 


AA478109 


Hs.188833 


ESTs 


106747 


NM_007118 


Hs.171957 


triple functional domain (PTPRF interact 


106743 


8E613328 


Hs.21938 


hypothetical protein FLJ12492 


106667 


AW360847 


Hs.16578 




108605 


AW772298 


Hs.21103 


Homo sapiens mRNA; cDNA DKFZp564B076 (fr 


106567 


AW450408 


Hs.86412 


chromosome 9 open reading frame 5 


106562 


AL031846 


HS.1521S1 


plakophilin 4 


106536 


AA329648 


Hs.23e04 


ESTs, Weakly similar to PN0099 son3 prot 


106533 


AL134708 


HS.14S998 




106507 


AA259068 


Hs.267819 


protein phosphatase 1, regulatory (inhib 


106490 


AA404265 


Hs.115537 


putative dipeptidase 


106474 


BE383668 


Hs.42484 


hypothetical protein FLJ10618 


106211 


AA428240 


Hs.126083 




105986 


AB037722 


HS.B707 


KIAA1301 protein 


105894 


AI904740 


Hs.25691 


receptor (calcitonin) activity modifying 


105847 


AW964490 




ESTs, Weakly similar to S65857 alpha-1C- 


105803 


AW747996 


Hs.160999 


ESTs, Moderately similar to A58194 throm 


105731 


AAB34664 


Hs.29131 


nuclear receptor coactivator 2 


105729 


H46612 


HS.29381S 


Homo sapiens HSPC285 mRNA, partial cds 


105688 


AI299139 






105510 


Z42047 


Hs.283978 


Homo sapiens PRO2751 mRNA, complete cds 


105101 


H63202 


Hs.38163 


ESTs 


104989 




HsJ!8S243 


hypothetical protein FLJ22029 




A\M)8882S 


Hs.117176 


poly(A)-binding protein, nuclear 1 


104869 


AI670947 




phosphatidylinositol-4-phosphate 5-kinas 


104903 


AI436323 




Homo sapiens mRNA for KIAA1568 protein, 




AW015318 


Hs.23165 




104865 


T79340 


Hs.22575 


Homo sapiens cDNA: FLJ21042 fis, clone C 




AA035613 


Hs.141883 






AA099904 


Hs.21610 


DKFZP434B203 protein 


104776 


AA026349 




gb:499f01.s1 Soares_pregnant_uterus_NbH 


104691 






Homo sapiens beta-1 adrenergic receptor 
















gb:EST00057 HEoW Homo sspieiis cunA done 


104392 


AA076049 


HS.27441S 


Homo sapiens cDNA FLJ10229 fis, clone HE 




AB002298 


Hs.173035 


KIAA0300 protein 


104074 


AL162039 


Hs.31422 


Homo sapiens mRNA; cDNA DKFZp434HA229 (fr 


103749 


AL13S301 


Hs.8768 


hypothetical protein FLJ10849 


103845 


AW2462S3 


Hs.7043 


succinate-CoA ligase, GDP-forming, alpha 


103554 


AI878826 


Hs.323469 


caveolin 1, caveolae protein, 22kD 


103S41 


AI81S601 


Hs.79197 


CD83 antigen (activated B lymphocytes, i 


103498 


Y09267 


Hs.132a21 




103428 


BE383507 


HS.78S21 


A kinase (PRKA) anchor protein 1 


103353 


XB9399 


Hs.119274 


RAS p21 protein activator (GTPase activa 



15.00 
25.60 
14.20 
11.00 
26.00 



14.20 
51.80 
29J20 
10.73 
32.m 
17.40 
10.43 
11.40 



15.20 
10.44 



Z7.20 

11.20 
10.88 
12.00 
103295 


X81479 


HS.237S 


egf-like module containing, mucin-like, 


103280 


UB4722 


Hs.76206 


cadherin 5, type 2, VE-cadherin (vascula 




NM_005574 




LIM domain only 2 (rhombotin-like 1) 




NM_002837 


Hs.123641 


protein tyrosine phosphatase, receptor t 




M18867 


Hs.1867 


progastricsin (pepsinogen C) 




BE24S169 


Hs.211610 


CUG triplet repeat, RNA-binding protein 




U60808 


Hs.152981 


CDP-diacylglycerol synthase (phosphatida 


102417 


AA034127 


Hs.153487 


signal transducing adaptor molecule (SH3 




NM_003734 


Hs.198241 


amine oxidase, copper containing 3 (vasc 




AA30S342 




protein kinase C-like 2 




AWt61552 


Hs.83381 


guanine nucleotide binding protein 11 


10218S 


U20350 


Hs.78913 


chemokine (C-X3-C) receptor 1 








steroidogenic acute regulatory protein 








spleen tyrosine kinase 






Hs.75182 


mannose receptor, C type 1 




NM_002432 




myeloid cell nuclear differentiation ant 




Hs.81256 


S100 calcium-binding protein A4 (calcium 




AP050658 


HS.2S63 


tachykinin, precursor 1 (substance K, su 






Hs.2181 


complement component 5 receptor 1 (C5a l 








gb:Human alpha satellite and satellite 3 




NM_000132 


Hs.79345 


coagulation factor VIII, procoagulant co 








hydroxyprostaglandin dehydrogenase 15-(N 


101345 


NM_005795 


Hs.152175 


calcitonin receptor-like 




NMJD05732 




FBJ murine osteosarcoma viral oncogene h 




L43821 


Hs.80261 


enhancer of filamentation 1 (cas-like do 




BE29762B 


Hs.298049 


microfibrillar-associated protein 4 




L358S4 




gb:Human dystrophin (dp140) mRNA, 5' end 




NM_005308 


HS.211S69 






NM_003243 
X70697 


Hs.79059 




101088 


HS.S53 


solute carrier family 6 (neurotransmitte 


101066 


AW970254 


Hs.a89 


Charcot-Leyden crystal protein 




BE379727 


Hs^ia 


fatty acid binding protein 4, adipocyte 




BE245294 


Hs.180789 


S164 protein 




W25797xomp Hs.177466 


amyloid beta (A4) precursor protein (pro 


100716 


X8g8S7 


Hs.172350 


HIR (histone cell cycle regulation defec 


100SSS 


M69181 




gb:Human nonmuscle myosin heavy chain-B 


100425 


NM_014747 


Hs.7874a 


KIAA0237 gene product 


100408 


D86640 


HS.S6045 


src homology three (SH3) and cysteine ri 


100382 


D83407 


H8.156a07 


Down syndrome critical region gene 1-lik 


100351 
100299 


064158 
D49493 


Hs.2171 


growth differentiation factor 10 


100134 


AA30S746 


Hs.49 


macrophage scavenger receptor 1 


100108 


UQ9S77 


Hs.76873 




100095 


297171 


Hs.78454 


myocilin, trabecular meshwork inducible 



TABLE 3B shows the accession numbers for those primekeys lacking unigeneID's for Table 3A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 

CAT number: Gene cluster number 

Pkey CAT number Accession 



123619 
126433 
125831 
126816 
126852 



122011 
120934 
123802 
116814 
118329 
104404 
104776 
113502 
101262 
108573 
101447 



AA399961 AA128347 
AA393283 AA3g8628 

AA811804AA809404AA2B6907AW977624 
AA431082 

AA226iga AA226S13 AA383773 



371681_1 AA602964 AA609200 
127143_1 AA325606 AA099517 N89423 
1522905 1 H04043 {360988 060337 
122973_1 
136135_1 
273450_1 
200885_1 
7617_-2 
177S21_1 

genbankJM620448 
g8nbank_tt50834 
genbankJ483520 
H58762_al H58762 
genbank_AA026349 
genbank_T89130 T89130 
entrez_L35854 L35854 
genbank_AA086005 
entrez_M21305 M21305 
genbank_N22401 N22401 
genbank_AA128654 AA128654 
genbank_R97018 R97018 
enlrezJ3841S8 D8415S 
ggrJHT2245 Me9181 Mftl 105 U51039 



AA086005 



Table 4A shows 202 genes up-regulated in samples from [...] 

Eos/Affymetrix Hu03 Genechip array. [...] 

the relative level of mRNA expression. 

Pkey: Unique Eos probeset identifier number 
ExAccn: Exemplar Accession number, Genbank accession number 
UnigeneID: Unigene number 
Unigene Title: Unigene gene title 



R1: 


average of AI for samples [...] 






UnigeneID 


100113 


NM_001269 


Hs.84746 


100187 




Hs.78183 


100210 






100225 




HS.1671B5 


100269 


NM_001949 




100438 






100877 


X80821 


Hs.27973 


100893 


BE245294 


Hs.180789 




Z11933 


Hs.182505 


101447 


M21305 




101649 


AW959908 


Hs.1690 


101724 


L11690 




101748 


NM_001944 


Hs.1925 


101809 


M88849 


Hs.323733 


101879 


AA176374 


Hs^43886 


101915 


M=207881 


HS.15S185 


101973 


U41514 


Hs.80120 


102025 


U04045 




102031 


U0489B 




102052 


NM_002202 


Hs.505 


102391 


AA296874 




102420 


U44060 




102610 


U65011 




1028^ 


NM_00S183 


Hs.60962 


103000 


NML0019re 


Hs.146580 


103507 


M13509 
AJ000512 


Hs.83169 
Hs.296323 


10^ 


BE270268 


Hs.82126 


104660 


BE298665 


Hs.14846 


104896 


AW015318 


Hs.23165 


105038 


AW503733 


Hs.9414 


105298 


BE387790 


Hs.26369 


105510 


Z42047 


Hs.283978 


105667 


AA767526 




106073 


AL157441 


Hs.17834 


106205 


AWg65058 


Hs.111583 


106516 


AL137311 


Hs.234074 


106533 


AL134708 


Hs.145998 


106575 


AW970S02 


Hs.105421 


106654 


AW075485 


Hs.286049 


106851 


AI458623 




106995 


AB023139 


Hs.37892 
Hs.183297 


107332 
107532 






107922 


BE153855 


Hs.61460 


108609 


BE409857 


Hs.69499 


108780 


AU076442 


Hs.117938 


109166 


AA219691 




109260 


AW978515 


Hs.131915 


109280 


AK0013S5 


Hs.279610 


109292 


AW975746 


Hs.186662 


109384 






10941S 




Hs.110826 


109445 


AA92103 


HS.18991S 




AW967069 


HS.211S56 


109633 


AW00378S 


Hs.170267 


109786 


AI989482 


Hs.146286 


109958 


AA001266 


Hs.133521 


110920 


N47224 


HS.20S21 


110924 


AW058463 


Hs.12940 


111084 


H44186 


HS.1S456 


111132 


AB037807 


HS.B3293 


111229 


AW389845 


HS.1108S5 


111337 


AAS37396 


Hs.263925 


111987 


NM_015310 


Hs.6763 


112046 


AA383343 


Hs.22116 


112268 


W3g609 


Hs.22003 


112685 


R87650 


Hs.33439 


112871 


AL110216 


Hs.12285 


112897 


AW2064S3 


HsJ782 


112973 


AG033023 


KsJ18127 


112992 


AL1S74S 


Hs.133315 



chromosome condensation 1 
aldo-keto reductase family 1, member C3 
KIAA0042 gene product 
glutamate receptor, metabotropic 5 
E2F transcription factor 3 
topoisomerase (DNA) binding protein 
KIAA0874 protein 
S164 protein 

POU domain, class 3, transcription facto 
gb:Human alpha satellite and satellite 3 
heparin-binding growth factor binding pr 
bullous pemphigoid antigen 1 (230/240kD) 



py or radiotherapy. These genes were selected from 59680 probesets on the 



by the average of AI for normal lung samples, 



27.20 
20.60 
20.40 
20.60 
29.40 



gap junction protein, beta 2, 26kD (conn 
nuclear autoantigenic sperm protein (his 
cytosolic ovarian carcinoma antigen 1 
UDP-N-acetyl-alpha-D-galactosamine:polyp 
mutS (E. coli) homolog 2 (colon cancer, 



transcription factor, LIM/homeodoma 
deoxyguanosine kinase 
Homo sapiens cDNA: FLJ21800 fis, clone H 
preferentially expressed antigen in mela 



enolase 2, (gamma, neuronal) 

matrix metalloproteinase 1 (interstitial 

serum/glucocorticoid regulated kinase 

5T4 oncofetal trophoblast glycoprotein 

Homo sapiens mRNA; cDNA DKFZp564O016 (fr 

ESTs 

KIAA1488 protein 

hypothetical protein FLJ20287 

Homo sapiens PRO2751 mRNA, complete cds 

paired box gene 5 (B-cell lineage specif 

downstream neighbor of SON 

ESTs, Weakly similar to I38022 hypotheti 

Homo sapiens mRNA; cDNA DKFZp761G02121 ( 



phosphoserine aminotransferase 

gb:lk048C9Ji1 Na_CGAP_Lu24 Homo sapiens 

KIAA0922 protein 

DKFZP566F2124 protein 

Homo sapiens mRNA; cDNA DKFZp762G207 (fr 

Ig superfamily receptor LNIR 

collagen, type XVII, alpha 1 

RAB6 interacting, kinesin-like (rabkines 

KIAA0863 protein 

hypothetical protein FLJ10493 

KIAA1702 protein 

ESTs 

trinucleotide repeat containing 9 
ESTs 

hypothetical protein MGC5457 
ESTs 

kinesin family member 13A 
ESTs 

HMT1 (hnRNP methyltransferase, S. cerevi 

zinc-fingers and homeoboxes 1 

POZ domain containing 1 

hypothetical protein 

ESTs 

LIS1-interacting protein NUDE1, rat homo 
KIAA0942 protein 

CDC14 (cell division cycle 14, S. cerevi 
solute carrier family 6 (neurotransmitte 
ESTs, Weakly similar to ALU1_HUMAN ALU 
ESTs, Weakly similar to I55214 salivary 



21.80 
193.60 
38.40 
198.80 
78.60 
162.20 
SO.O0 
26.00 
37.20 

32.00 
51J!0 
13.90 
28.80 
110.60 
116.80 



42.60 
29A0 
21.50 
32.80 
20.20 
28.40 
25.40 
32.00 
40.60 
59.80 
43.40 
50.80 
53.40 
20.88 
23.60 
57.20 
49.00 
19.67 
48.17 
59.20 
28.60 
22.80 

21.00 
31.60 
24.20 
21.40 
20.40 
19.60 
24.00 
28.40 
38.00 
61.20 
24.60 
27.20 
48.00 
37.80 
26.80 
63.80 
26.40 
47.64 
22.00 
65.00 
42.00 
55.40 







HS.86S3B 










Hs ^68626 








AA4S7211 






51.80 








Homo sapiens mRNA; cDNA DKFZp434E082 (fr 
ESTs 


28.20 








20.20 








hypothetical protein FLJ14627 








Hs.271616 


^Ts, Weekly sfmOar to ALUS HUMAN ALU S 






AW163267 










AA960981 










BE244930 


Hs.16E895 




30.20 




AW966931 


Hs.179662 








AA814043 


Hs.88045 


ESTs 










hypothetical protein FLJ10618 






BE545072 




hypothetical protein FLJ10461 






AAB08001 
























ESTs, Weakly similar to DAP1_HUMAN DEATH 
















AL133916 










AA889120 




















AF161470 












ESTs, Weakly similar to KCC1_HUMAN CALCI 


22.40 








Homo sapiens mRNA for KIAA1771 protein, 












20.0) 




AI8240(ffl 






19.40 
























AL050097 


Hs.272531 


DKFZP586B0319 protein 








Hs.205442 


ESTs, Weakly similar to T34036 hypothet 










gb2rS9c10.s1 Soaes NhHMPU S1 Homo s^ 


























AVV976570 










AW450737 






















Hs.128708 








AA4S7200 




gb:ab19f02.s1 Stratagene lung (937210) H 








n5.1 12400 








AA421130 


Hs.112640 












gojio97cu2.si Nd^CGAP^rZ rUflu> sapans 


















Hs^1630 










Hs.111801 










Hs.102670 








AWd2816o 


Hs.152684 








NM_014053 


Hs.270534 


FLVCR protein 














AA61QS2U 


nS.18l244 


cns^ HstocoRipafibOi^ comptex* dass 
















NM_013243 














protein kinase (cAMP-dependent, catalyti 








H&l 58849 


Homo sapiens cDNA: FLJ21683 fis, clone C 






AL360190 




Homo sapiens mRNA full length insert cDN 








Hs.249034 








AA193325 




hypothetical protein FLJ21901 




























n8,2/893D 


hypothetical protein FLJ12929 












23.20 
























AA648866 


Hs.151999 










Hs.173933 








AVv4o0979 




gt>'lil44-B(3-aIa-a-12-04JI.s1 NCf CGAP Su 




















ESTs, Moderately similar to PC4259 ferri 
















AVVZ97ZQ6 






























AA809672 


Hs.123304 


















AI022103 






19.60 








Homo sapiens, clone IMAGE:3867243, mRNA 




128609 


l<nU0O3616 


HS.1024S6 


survival of motor neuron protein interac 


34.40 


128777 


AI878918 


Hs.10526 


cysteine and glycine-rich protein 2 


53.80 


128949 




H&88S0 


a disintegrin and metalloproteinase doma 


23.00 


129168 


AI132988 


Hs.109052 


chromosome 14 open reading frame 2 


37.60 


129404 


AI26770Q 


Hs.317584 


ESTs 


28.60 


129527 


AA759221 


Hs.270847 


dsBa^uGn 


40.80 


129574 


AA02681S 


Hs.11463 


UMP-CMP kinase 


31.20 


129598 


N3C436 


Hs.11556 


Homo sapiens cDNA FU12566 lis, done m- 


29.60 


12978S 


Higooe 


Hs.184780 


ESTs 


72.20 


12M70 


AVISOS 


Hs.298ig8 


chromosome 12 open reading frame 4 


2Z20 
130149 


AW067805 


Hs.172665 


130199 


Z48579 


Hs.172028 


130441 


U63630 


Hs.155637 


130466 


W19744 


Hs.180059 


130462 


AW409701 


Hs.1578 


130817 


M90516 


Hs.1674 


130703 


R77776 


Hs.18103 


130732 


AW890487 


HS.639S4 


130867 


NM_001072 


Hs.284239 


131028 


AI87916S 


Hs.2227 


131086 


AU)35461 


Hs^81 


131284 


NM_001429 


Hs.25272 


131775 


AB014S48 


Hs.31921 


131B60 


BE383676 


Hs.334 


131945 


NM_002916 


HS.3S120 


132040 


NM_001196 


Hs.315689 


132084 


NM_002267 


Hs.3886 


132389 


AA3103g3 


KS.19Q044 


132437 


AA1S2ia6 


HS.48S9 


132SS0 


AW969253 


HS.17019S 


132B17 




Hs.5338 


132632 


AU076916 


Hs.5398 


132672 


W27721 


Hs.54697 


132742 


AA025480 


HS.29M12 


132771 


Y10275 


Hs.56407 


133070 


U92&49 


Hs.64311 


1331S3 


Aro70S92 


Hs.66170 


133181 


X91662 


Hs.66744 


133282 


AA449015 


Hs^145 


133350 


M499220 


Hs.71573 




A\/652066 


HS.7S113 


133658 


AA319146 


HS.7S426 


13386S 


AB0111S5 


Hs.170290 


134032 


NM_005025 


HS.76SS9 


134125 


NM_014781 


HS.S0421 


134158 




Hs.79428 


134321 




Hs.8172 


134367 


A^4« 


Hs.82285 


134570 


U86S15 


Hs.172280 


134753 


NMJM16482 


HS.17313S 


135002 


AA448542 


HSJ2S1677 


135029 


HS8818 


H5.18;^ 


135047 


AL134197 


HS.93S97 


135345 


X536S5 


HsJ9171 



methylenetetrahydrofolate dehydrogenase 
a disintegrin and metalloproteinase doma 
protein kinase, DNA-activated, catalytic 
Homo sapiens cDNA FLJ20653 fis, clone KA 
baculoviral IAP repeat-containing 5 (sur 
glutamine-fructose-6-phosphate transamin 
ESTs 

cadherin 13, H-cadherin (heart) 

UDP glycosyltransferase 1 family, polype 

CCAAT/enhancer binding protein (C/EBP), 

chromogranin B (secretogranin 1) 

E1A binding protein p300 

KIAA0648 protein 

Rho guanine nucleotide exchange factor ( 
replication factor C (activator 1) 4 (37 
Homo sapiens cDNA: FLJ22373 fis, clone H 
karyopherin alpha 3 (importin alpha 4) 
ESTs 

cydinLania-6a 

bone morphogenetic protein 7 (osteogenic 



guanine monphosphate synthetase 
Cdc42 guanine exchange factor (GEF) 9 
ESTs, Weakly similar to T33468 hypotheti 



HSKkUpfotdn 
twist (Drosophila) homolog (acrocephalos 
SRB7 (suppressor of RNA polymerase B, ye 
hypothetical protein FLJ10074 



serine (orcj^ne) pratalnase M 
KIAA0203 gene product 
BCL2/adenovirus E1B 19kD-interacting pro 
ESTs, Moderately similar to A46010 X-lin 
phosphoribosylglycinamide formyltransfer 
SWI/SNF related, matrix associated, acti 
dual-specificity tyrosine-(Y)-phosphoryl 
GanligenTB 

• ■ •j(174ieta)d8'-' 



19.40 
21.40 
110.00 
25.20 
40.60 
24.60 
21.00 
33.40 
60.80 
20.40 
29.40 
32.40 
27.40 
75.60 
31.36 
32.40 
23.40 
61.20 



49.20 
20.20 
20.80 
37.60 
53.40 
31.60 



TABLE 4B shows the accession numbers for those primekeys lacking unigeneID's for Table 4A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 

CAT number: Gene cluster number 

Pkey CAT number Accession 

123619 371681_1 AA602964 AA609200 

126433 127143_1 AA325606 AA099517 N89423 

12^72 142698_1 AW450979 AA136853 AA136655 AW419381 AA984358 AA492073 BE168945 AAa090S4 AW238038 BE011212 BE011359 

BE011367 BE011368 BE011362 BE011215 BE011365 BE011363 

106851 322947_1 AI458623 AA639708 AA485409 R22085 AA485570 

118720 genbank_N73515 N73515 

120515 genbank_AA258356 AA258356 

117099 321871_1 H93699 H97976 H80036 

101447 entrez_M21305 M21305 

123130 genbank_AA487200 AA487200 
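
The Pkey / CAT number / Accession layout described above can be loaded into a lookup structure. The following is a minimal sketch only: the function name `parse_cluster_rows`, the returned dictionary layout, and the rule of skipping rows with fewer than three fields are illustrative assumptions, not part of the specification. The two sample rows are taken from Table 4B.

```python
from typing import Dict, Iterable, List, Tuple


def parse_cluster_rows(lines: Iterable[str]) -> Dict[str, Tuple[str, List[str]]]:
    """Map each Pkey to its (CAT number, list of Genbank accessions).

    Assumes whitespace-separated rows of the form:
        Pkey  CAT_number  Accession [Accession ...]
    Rows with fewer than three fields are skipped (an assumed rule).
    """
    clusters: Dict[str, Tuple[str, List[str]]] = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # blank line or incomplete row
        pkey, cat_number, accessions = fields[0], fields[1], fields[2:]
        clusters[pkey] = (cat_number, accessions)
    return clusters


# Two rows as they appear in Table 4B.
rows = [
    "123619 371681_1 AA602964 AA609200",
    "117099 321871_1 H93699 H97976 H80036",
]
clusters = parse_cluster_rows(rows)
print(clusters["117099"])
```

Rows whose accession list wraps onto a continuation line are not handled by this sketch and would need to be joined before parsing.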
Table 5A shows 680 genes upregulated in squamous cell carcinoma or adenocarcinoma lung tumors relative to normal lung and chronically diseased lung. These genes were 
selected from 59680 probesets on the Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average 
intensity (AI), a normalized value reflecting the relative level of mRNA expression. 

Pkey: Unique Eos probeset identifier number 
ExAccn: Exemplar Accession number, Genbank accession number 
UnigeneID: Unigene number 
Unigene Title: Unigene gene title 
R1: 70th percentile of AI for squamous cell carcinoma and adenocarcinoma lung tumor samples divided by the 90th percentile of AI for normal and chronically diseased lung samples. 
R2: 80th percentile of AI for adenocarcinoma lung tumor samples divided by the 90th percentile of AI for normal and chronically diseased lung samples. 
R3: 80th percentile of AI for squamous cell carcinoma lung tumor samples divided by the 90th percentile of AI for normal and chronically diseased lung samples. 
R4: 80th percentile of AI for adenocarcinoma lung tumor samples divided by the 80th percentile of AI for squamous [...] 
R5: 70th percentile of AI for squamous cell carcinoma and adenocarcinoma lung tumor samples minus the 15th percentile [...] diseased lung and tumor samples divided by 90th percentile of AI for normal and chronically diseased [...] normal lung, chronically diseased lung and tumor samples 
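
The percentile ratios in the legend (for example R2, the 80th percentile of AI in adenocarcinoma samples divided by the 90th percentile of AI in normal and chronically diseased lung samples) can be illustrated with a short sketch. The specification does not state how percentiles were interpolated, so the linear-interpolation rule below is an assumption, as are the function names `percentile` and `tumor_ratio` and the sample AI values, which are invented for illustration and are not data from the tables.

```python
from typing import Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """Return the q-th percentile (0-100) with linear interpolation between ranks.

    The interpolation rule is an assumption; the source does not specify one.
    """
    if not values:
        raise ValueError("percentile of an empty sequence is undefined")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def tumor_ratio(tumor_ai: Sequence[float], reference_ai: Sequence[float],
                tumor_q: float = 80.0, reference_q: float = 90.0) -> float:
    """Ratio of a tumor-sample percentile to a reference-sample percentile of AI."""
    return percentile(tumor_ai, tumor_q) / percentile(reference_ai, reference_q)


# Hypothetical AI values for one probeset (not taken from the tables).
adenocarcinoma_ai = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
normal_and_diseased_ai = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]

r2 = tumor_ratio(adenocarcinoma_ai, normal_and_diseased_ai)
print(r2)
```

Using a high percentile of the reference samples in the denominator means a probeset scores well only when most tumor samples exceed nearly all reference samples, which is a stricter criterion than a ratio of means.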



UnigeneID Unigene Title 






100071 A2S102 

100114 X02308 

100154 H60720 

100187 D17793 

100188 AW247090 
100202 BE294407 
100216 AA48930S 
100269 NM_001949 
100287 AU076657 
100297 AU077258 
100330 AW410976 
100335 AW247529 
100360 W70171 
100372 NM_0U791 
100474 NM_000699 
100486 719006 
100491 056165 
100516 090278 
100522 XS1S01 
100559 
100576 X003S6 
100629 AA01S693 
100661 BE623001 
100677 AA353686 
100896 014887 
100709 N26539 
100761 BE208491 
100830 AC004770 
100667 U14622 
100S02 M16a29 
100906 AU076916 
100960 J00124 
10104S J05614 
101061 NM_000175 
101071 L02840 
101124 L10343 
101175 U82671 



101204 L24203 

101210 L29301 

101216 AA284166 

101228 AA333387 

101233 AL13S173 

101273 Z11933 

101342 US2112 

101346 AI73a816 

101389 NM_000892 

1013S6 BE267931 

101431 BE185289 

101448 NM_000424 

101462 Al.035668 

101466 BE262660 

101484 AA053486 



Hs.82962 

HS.B1892 

Hs.78183 

Hs.57101 

Hs.99910 

Hs.13g0 

KS.11B9 

Hs.1600 

Hs.la2429 

Hs.77152 

Hs.6793 

Hs.7S93g 

Hs.184339 

Hs.300280 

Hs.10842 

KS.27S163 

Hs.11 

Hs.99949 

Hs.1640 

HSJ7058 

Hs.21291 

Hs.132748 

Hs.57813 

HS.1216B6 

Hs.100469 

Hs.295112 

Hs.4756 

Hs.287270 
Hs.5398 
Hs.117729 

Hs.180532 

Hs.84244 

Hs.)12341 

Hs.36g80 

Hs.737g8 

Hs.82237 

HS.23S3 

Hs.84113 

Hs.82916 

Hs.878 

HS.182S05 

Hs.182018 

HSL77348 

Hs.1901 

Hs.76996 

Hs.1076 

Hs.195850 

Hs.73853 

Hs.170197 

Hs.20315 



AFFX control: GAPDH 

AFFX control: GAPDH 

AFFX control: GAPDH 

Human GABAa receptor alpha-3 subunit 

thymidylate synthetase 

KIAA0101 gene product 

aldo-keto reductase family 1, member C3 



proteasome (prosome, macropain) subunit, 
E2F transcription factor 3 
chaperonin containing TCP1, subunit 5 (e 
protein disulfide isomerase-related prot 
minichromosome maintenance deficient (S. 
platelet-activating factor acetylhydrola 
uridine monophosphate kinase 
KIAA0175 gene product 
amylase, alpha 2A; pancreatic 
RAN, member RAS oncogene family 
non-metastatic cells 2, protein (NM23B) 
carcinoembryonic antigen-related cell ad 



collagen, type VII, alpha 1 (epidermolys 



mitogen-activated protein kinase kinase 
Homo sapiens ribosomal protein L39 mRNA, 
zinc ribbon domain containing, 1 
general transcription factor IIA, 1 (37k 
myeloid/lymphoid or mixed-lineage leukem 
KIAA0618 gene product 
flap structure-specific endonuclease 1 
gb:Human transketolase-like protein gene 
ret proto-oncogene (multiple endocrine n 
guanine monphosphate synthetase 
keratin 14 (epidermolysis bullosa simple 
gb:Human proliferating cell nuclear anti 
glucose phosphate isomerase 
potassium voltage-gated channel, Shab-re 



melanoma antigen, family A, 2 
macrophage migration inhibitory factor ( 
ataxia-telangiectasia group D-associated 
opioid receptor, mu 1 
cyclin-dependent kinase inhibitor 3 (CDK 
chaperonin containing TCP1, subunit 6A ( 
sorbitol dehydrogenase 
POU domain, class 3, transcription facto 
interleukin-1 receptor-associated kinase 
hydroxyprostaglandin dehydrogenase 15-(N 
kallikrein B, plasma (Fletcher factor) 1 



small proline-rich protein 1B (cornifin) 
keratin 5 (epidermolysis bullosa simplex 
bone morphogenetic protein 2 
glutamic-oxaloacetic transaminase 2, mit 
interferon-induced protein with tetratri 
gb:Human parathyroid hormone-related pro 



38.80 
12.00 






101577 VI34353 

101649 AW959908 

101663 NM_003528 

101664 AA4369a9 
101669 124498 



Hs.1041 

Hs.1690 

Hs.2178 

Hs.121017 

Hs.80409 



heparin-binding growth factor binding pr 
H2B histone family, member Q 
H2A histone family, member A 
growth arrest and DNA-damage-inducible 
101695 M69136 Hs.135626 

101724 L11690 

101748 NM_001944 

101759 M80244 

101771 NM_002432 

101804 M86699 

101809 M88849 

101833 AU076442 

101842 M93221 

101851 BE2S0984 

102002 NM_002484 NsJ1469 

102039 AL134223 Hs.306(S8 
Hs.620 

Hs.1925 

Hs.184601 

Hs.153837 

Hs.169840 

Hs.323733 

Hs.117938 

Hs.751fl2 

Hs.82045 



102083 T3S901 

102111 L36196 

102123 NM_001809 

102154 U17760 

102193 AL036335 

102217 M829978 

102224 NM_002810 

102234 AW163390 

102251 NM_004398 

102305 AL043202 



102340 U37055 

102348 U37S19 

102368 U39817 

102394 NM.0O381& 

102404 NM_005429 

102537 U57094 

102581 AU077228 

102605 Ai435128 

102610 U65011 

102623 AW24g2aS 

102642 AA205847 

102654 AV6499^ 



Hs.78743 
Hs.75117 
Hs.81884 
Hs.1594 
HS.75S17 
Hs.313 
HSJ01613 
Hs.148495 
Hsi7B554 
Hs.41706 
Hs.90073 
AI859431 H20478 AA218882 AA757465 AA100995 A1864135 AI934^ AA070503 H47008 AA219846 W61039 W93907 AW3850S0 W37g67 
W78028AA189007AA479136 R93650AA442312 T30287AA847628AA180262AA009649 C03892AW149464AA310963AA219693 
AA069747 R29207 AA094784 AA29361S AA447848 AI9B4167 N90393 C05097 N56499 AW292351 AW149681 AW473258 AA629322 A1004409 
AW105577 AI954937 AI811 070 AA902422 AW514437 AA535460 AA916877 AW517122 AA974657 AA97S649 AWS17130 AW517129 F31737 
W076B8 AA19364S AA37B994 AA4S9273 F32267 W39303 AA021 181 N86810 AA406524 AA062553 AA436B01 H08985 H1597g N40310 
AA436789AA232172AVV36(»78W2S662 R60282AA436530AA378894/W187461AI940535Mfi0« 

AA209340NS6ir4N88374AA191088AWa47691 AA249013AM93111 AA972536AW298S94AA375893T12139W2B186AW243849 

AI2B8629 AAB43996 W1S260 Atl 88^ AW248079 R1S836 
119599 genbank_W45552 W45552 
112382 genbank_R59904 R59904 
105284 genbank_AA227934 AA227934 
100071 entrez_A28102 A28102 
123315 714071_1 AA496369 AA496646 



Table 6A shows 99 genes up-regulated [...] relative to smokers with lung cancer. These genes were selected from 59680 probesets on the [...] 

Pkey: Unique Eos probeset identifier number 
ExAccn: Exemplar Accession number [...] 
UnigeneID: Unigene number 
Unigene Title: Unigene gene title 
R1: average of AI for samples from non-smokers with ad[...] 
R2: average of AI for samples from non-smokers with squamous ce[...] 



UnigeneID Unigene Title 



101174 
101296 
101304 
101806 
101972 
102274 



104307 
106131 
106672 
106872 



109247 
109830 
110193 
110234 
110644 
110885 
111057 
111950 
112291 
112956 
113009 
113060 
113073 
113074 
113121 
113125 
113757 
113848 
113884 
113936 
114875 
114987 
115460 
115722 
116261 
116830 
116970 
117178 
117757 



120404 
120524 
120688 



ExAccn 

BE379727 Hs.83213 
L17330 Hs.280 
Y12490 Hs.85092 
AA001021 HS.S685 
AASe6894 Hs.1 12408 
S82472 

U30930 Hs.1 58540 
NM_003816 Hs.2442 
U92015 

XS2509 Hs.161640 
X98266 

L02911 H3.150402 
HS.4S033 
Hs.21355 
AW3730B2 Hs.83623 
" Hs.196701 
HS.29&244 
Hs.30643 
Hs.18282 
HS.32S01 
Hs.194478 
Hs.S7a87 



BE51478a 
H47233 
156887 
AA156238 
Z4384S 
AA035375 
AA100796 
AB0t8S49 
BE219231 
AA314907 
R44607 
A1004874 
H244Sa 
R942D7 
AW274992 
T79639 
AF071594 
R53972 
Z43784 



Hs.22672 
Hs.310764 
HS.3208S 



AK001335 

T48011 

AA96a672 



W52854 

AI333076 

W17056 

AA23S6Qg 

AA251018 



W91892 

AA481788 

H61037 

AB023179 

H98675 

AF088019 

AA287747 

AF217525 

AI822106 

AA923278 



Hs.103042 

Hs.31137 

Hs.8764 

Hs.8929 

Hs.18631 

H8.27099 

Hs.28529 

HS.83&23 

H5.236443 

Hs.87808 

HsJ8613 

HS.59B09 

Hs.190150 

Hs.70404 

H$.9059 

HSJZ69034 

Hs.46732 

Hs.173012 

Hs.49Qa2 

Hs.49g02 

Hs.290905 

Hs.96427 



Hs.72249 
Hs.14629 
HS.1104S7 



fatty acid binding protein 4, adipocyte 
pro-T/NK cell associated protein 
thyroid hormone receptor interactor 11 
thyroid hormone receptor interactor 8 
S100 calcium-binding protein A7 (psorias 
gb:beta -pol=DNA polymerase beta (exon a 
UDP glycosyltransferase 8 (UDP-galactose 
a disintegrin and metalloproteinase doma 
gb:Human clone 143789 defective mariner 
tyrosine aminotransferase 
gb:H.sapiens mRNA for tgase like protei 
activin A receptor, type 1 
lacrimal proline rich protein 
doublecortin and CaM kinase-like 1 
nuclear receptor subfamily 1, group I, m 
ESTs, Weakly similar to ALU1_HUMAN ALU 
SNARE protein 
ESTs 

KIAA1134 protein 
ESTs 

Homo sapiens mRNA; cDNA DKFZp43401572 {I 
ESTs, Weakly similar to KIAA0758 protei 
gb3m26o06.sl Slratagene pancreas (93720 
MO-2 protein 

ESTs, Weakly similar to T26845 hypotheti 

ESTs 

ESTs 

Homo sapiens mRNA; cDNA DKFZp434M082 (fr 
EST 

ESTs. Highly similar to type II CALM/AF1 



AA2fit852 HS.19290S ESTs 



Wolf-Hirschhorn syndrome candidate 1 
ESTs 

ankyrin 3, node of Ranvier (ankyrin G) 
ESTs 

hypothetical protein FLJ14827 
microtubule-associated protein 1B 
protein tyrosine phosphatase, receptor t 
EST 

hypothetical protein FLJ11362 
ESTs 

hypotheBeal protein FLI232M similar to 

chromosome 12 open reading frame 2 

nuclear receptor subfamily 1, group I, m 

Homo sapiens mRNA; cDNA DKF2:p564Nia63 ( 

EST 

ESTs 

ESTs 

ESTs 

ESTs, Weakly similar to ALU2_HUMAN ALU 

KIAA0962 protein 

ESTs 

EST 

ESTs, Wealdy simnar ioMBOlO X-IInked 
Down syndrome cell adheston molecule 
ESTs 

ESTs. Weakly simflarto potease [asapl 
KlAA1013pn)teiR 



13.50 
16.50 



17.(H> 
16.50 
IIJX) 



AW207555 Hs.97093 



Homo sapiens cDNA: FLJ23004 fis, clone L 
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121558 


AA412497 




121676 


H56037 


Hs.108146 


121936 


A1024600 


Hs.98612 


121938 


AA428659 


Hs.98610 


122177 


AA435789 


Hs.98833 


123442 


AA299652 


Hs.1 11496 


123551 


AA60B837 




123756 


AA609971 


Hs.112795 


123861 
124371 


AA620840 
N24924 


Hs.188601 


127477 


SE32B720 








ffeWIOM 


128252 


AA45S924 


Hs.192228 


128426 


AI26S784 


HS.14S197 


128925 


R67419 


Hs.21851 


128S45 


AI990S06 


H3.8077 


129105 ^ 


AI769160 


Hs.108681 


129235 


AW977238 


Hs.126084 


129508 


AB02Q684 


Hs.1 1217 








130160 
130340 


AA30568B 
082326 


Hs.287695 
H$.239106 


131220 


AB023lg4 


HS.3008S5 


131430 


AI879148 


Hs.26770 


132114 


NM.006152 


Hs.402)2 


132458 


AA935315 


HS.4896S 


132847 


NIO08327 


Hs.54432 


132655 


D49372 


H&54460 


132682 
132747 


AI077500 
AA345241 


Hs .54900 
H5.SS950 


132812 


R50333 


Hs.92186 


133337 


AF085983 


Hs.293676 


133876 


AL134906 


Hs.771 


134119 


AW157837 


Hs.79226 


134464 


AA302983 


Ks.239720 


134542 


M14156 


Hs.85112 


135002 


AA448S42 


Hfc251677 



PCT/US02/12476 



gb:affi9301.s1 Soares.tesSsJlHT Homo sap 



Homo sapiens cDNA FLJ12900 fis, clone NT 
Homo sapiens mRNA; cDNA DKFZp547E184 (fr 
Homo sapiens brain tumor associated prot 
KIAA10S protein 
KIAA0877 protein 

oviductal glycoprotein 1, 120kD (mucin 9 
UDP-Gal:betaGlcNAc beta 1,3-galactosyltr 
solute carrier family 3 (cystine, dibasi 
KIAA0977 protein 
fatty acid binding protein 7, brain 
lymphoid-restricted membrane protein 
Homo sapiens cDNA: FLJ21693 fis, clone C 
sialyltransferase 4B (beta-galactosidase 
small inducible cytokine subfamily A (Cy 
serologically defined colon cancer antig 
ESTs, Weakly similar to KIAA1330 protein 



phosphorylase, glycogen; liver (Hers dis 
fasciculation and elongation protein zet 
CCR4-NOT transcription complex, subunit 



20.00 
11J0 
17.50 



as ware clustered based on sequence 



CAT number Accessions 

Table 7A shows 98 genes down-regulated in non-smokers with lung cancer relative to smokers with lung cancer. These genes were selected from 53680 probesets on the 
Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting 
the relative level of mRNA expression. 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: 90th percentile of AI for samples from smokers with adenocarcinoma divided by the average of AI for samples from non-smokers with adenocarcinoma 

R2: 90th percentile of AI for samples from smokers with squamous cell carcinoma divided by the average of AI for samples from non-smokers with squamous cell carcinoma 
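The R1 and R2 ratios defined for Table 7A can be illustrated with a minimal sketch. It assumes a linear-interpolation percentile convention, which the text does not specify, and the function names and AI values below are hypothetical, not taken from the table.

```python
from statistics import mean


def percentile(values, pct):
    """Percentile by linear interpolation between closest ranks (assumed convention)."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def smoker_ratio(smoker_ai, nonsmoker_ai, pct=90):
    """pct-th percentile of smoker AI divided by the average non-smoker AI."""
    return percentile(smoker_ai, pct) / mean(nonsmoker_ai)


# Hypothetical AI values for a single probeset (illustration only).
smokers = [100.0, 120.0, 140.0, 160.0, 180.0]
non_smokers = [40.0, 50.0, 60.0]
r1 = smoker_ratio(smokers, non_smokers)
print(round(r1, 2))
```

Under this definition, a ratio well above 1 marks a probeset whose expression is lower in non-smokers than in smokers, which matches the "down-regulated in non-smokers" wording of the table legend.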



Pkey 
















Hs.78ia3 


aldo-keto reductase family 1 , member C3 












neuroblastoma (nerve tissue) protein 






100576 


X00356 


Hs.37058 


calcitonin/calcitonin-related polypeptid 


102.40 






BE379727 


Hs.83213 


fatty acid binding protein 4, adipocyte 




















AW970254 




Charcot-Leyden crystal protein 










Hs .36980 


melanoma antigen, family A, 2 












homeo box A5 








NM_003528 




H2B histone family, member Q 








NM_000715 




complement component 4-binding protein, 








M83700 


H5.1S0403 








101941 


S77S83 




gb:HERVK10/HUMMTV reverse transcriptase 








NM_006456 


Hs.288215 


sialyltransferase 






102242 


U27185 


Hs .82547 


retinoic acid receptor responder (tazaro 






102340 


U3705S 


Hsi78657 


macrophage stimulating 1 (hepatocyte gro 






102369 


U39840 


Hs.299867 


hepatocyte nuclear factor 3, alpha 






102457 


NM_001394 


Hs.2359 


dual specificity phosphatase 4 






102669 


U71207 


Hs.29279 


eyes absent (Drosophila) homolog 2 




65.70 


102796 


AL07g646 


Hs.107019 


symplekin; Huntingtin interacting protei 




58.80 


102829 


NM_006183 


Hs.80962 


neurotensin 




268.80 


103207 


X72790 




gb:Human endogenous retrovirus mRNA for 


70.00 




103242 


X76342 


Hs.389 


alcohol dehydrogenase 7 (class IV), mu o 




21Z10 


103260 


X78416 


HS.31S5 


casein, alpha 




130.70 


1033S1 


X89211 




gb:H.sapiens DNA for endogenous retrovir 


64.60 




104212 


AB00229a 


HS.17303S 


KIAA0300 protein 






104252 


AF002246 


Ks.210863 


cell adhesion molecule with homology to 


63.80 




104258 


AF007216 


Hs.5462 


solute carrier family 4, sodium bicarbon 


94.40 




105024 


AA126311 


Ks.9879 


ESTs 


68.20 




106260 


AI097144 


Ks.5250 


ESTs, Weakly similar to ALU1_HUMAN ALU S 






106440 


AA449563 


HS.1S1393 


glutamate-cysteine ligase, catalytic sub 






106566 


66298210 




gb:6011180t6F1 NIH_MGC_17 Homo sapiens c 


73.20 




106605 


AW772298 


Hs.21103 


Homo sapiens mRNA; cDNA DKFZp564B076 (fr 


83.80 




106614 


AA648459 


Hs.335951 


hypothetical protein AF301222 








AW075485 


Hs.286049 


phosphoserine aminotransferase 






106999 


H93281 


Hs.10710 


hypothetical protein FLJ20417 






108700 


AA121518 


Hs. 193540 


ESTs, Moderately similar to 2109260A B c 






108810 


AW29S547 


Hs.71331 


hypothetical protein MGC5350 






108857 


AK001458 


Hs.62180 


anillin (Drosophila Scraps homolog), act 








AA989362 


Hs.293780 


ESTs 










Hs.12860 










AI743860 


Hs. 12876 




































Hs.293147 


ESTs, Moderately similar to A46010 X-li 








All 57425 


HS.13331S 


Homo sapiens mRNA; cONA DK(^76U1324 (f 










Hs.103042 


microtubule-associated protein 1B 










Hs.21948 










AA278300 




Homo sapiens cDNA: FLJ23123 fis, clone L 








BE54S072 


Hs.122579 


hypothetical protein FLJ10461 








AW905328 


Hs.180842 


ribosomal protein L13 


66.40 




llfflM 


AW872527 




ESTs, Weakly similar to DAP1_HUMAN DEATH 




226.60 


11S98S 


AA001732 


Hs.173233 


hypothetical protein FLJ10970 


82.80 




116107 


AL133916 


Hs.172572 


hypothetical protein FLJ20093 




381.60 


116552 


D20508 


Hs.164649 


hypothetical protein DKFZp434H247 


69.00 




116571 


D4S652 




gb:HUMGS02848 Human adult lung 3' direct 


64.20 




118466 


N66741 




gb.7z33g08.8l Morton FOd Coddea Homo 

EST 




63.50 


120484 


AA253170 


Hs.96473 


81.60 




120983 


AA398209 


Hs.97587 


EST 




81.10 


121034 


AU89951 


H5.271623 


nucleoporin 50kD 
ESTs 




66.20 


121423 


AW9733S2 


Hs.290585 


64.40 




122SS3 


AA451S84 


1^190121 


ESTs 




60.40 


122946 


A1718702 


KS.30802& 


major histocompatibility complex, class 


186.60 




123130 


AA487200 




gb:ab19f02.s1 Stratagene lung (937210) H 




80.20 


124472 


N52517 


Hs.102670 


EST 


71.00 




1245^ 


N62096 


Hs.293185 


ESTs, WeaMysimSarto JC732BaniiRoad 




104.90 


125489 


H49193 


Hs.124984 


ESTs, Moderately similar to ALU7_HUMAN A 




7^00 


125731 


R61771 


Hs.26912 


ESTs 




69.90 


125747 


NM_002884 


Hs.865 


RAP1A, member of RAS oncogene family 


63JX) 




126020 


H79863 


Hs.1 14243 


ESTs 




6Z40 


128547 


U47732 


Hs.84072 


transmembrane 4 superfamily member 3 




62.80 


126966 


R38438 


HS.182S7S 


solute carrier family 15 (H+/peptide tra 




60.10 
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AA9q(Jo67 


ns.lsu^' 1 






AW293496 


Hs 180138 






AI022103 


H*^11916^ 






AW889132 








AA6S0274 




fibronectin leucine rich transmembrane p 




AW160432 


Hs.296460 






AW935187 


Hs.170162 


KIAA1357 protein 




AB040930 


Hs.126085 


KIAA1497 protein 






Hs.132390 


zinc finger protein 36 (KOX 18) 




AW067800 


Hs. 155223 


stanniocalcin 2 




AW890487 




KIAA1467 protein 




AB040900 








BE^1914 




Homo sapiens cDNA FLJ11640 fis, clone HE 








KIAA0648 protein 




A501S324 




KMAOTSI fvctds 




NM_001448 
AA093322 


Hs.58367 




132977 


Ks.301404 


RNA binding motif protein 3 


133749 


120852 


Hs.10018 


solute carrier family 20 (phosphate tran 


133818 


AI110684 


HS.764S 


fibrinogen, B beta polypeptide 


134264 


AFU9297 


Hs.8087 


NA&5p!ptein 


13426S 


M83772 


Hs.80a76 


flavin containing monooxygenase 3 


134346 


X84a02 


Hs.82037 


TATA box binding protein (TBP)-associ 


134395 


AA45ES39 


Hs.a262 


lysosomal-associated membrane protein 2 


135047 


AL134197 


H$.93S97 


cyclin-dependent kinase 5, regulatory su 


135056 


N7576S 


Hs.93765 


lipoma HMGIC fusion partner 


135309 


A1564123 


Hs.42500 


ADP-ribosylation factor-like 5 



70.20 
64.00 
85.20 
96.60 



64.40 
76.20 
97.80 



133.20 
341.00 



Table 7B shows the accession numbers for those Pkeys lacking UnigeneIDs for Table 7A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). 



Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



106566 

116571 genbank_D45652 

118466 genbank_N66741 

101046 entrez_K01160 K01160 

101941 entrez_S77583 S77583 

103351 entrez_X89211 X89211 

123130 genbank_AA487200 

Table 8A shows 1720 genes either up- or down-regulated 

Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: 70th percentile of AI for lung tumors divided by 90th percentile of AI for normal lung 

R2: 70th percentile of AI for chronically diseased lung divided by 90th percentile of AI 
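As a rough illustration of the R1 ratio in the Table 8A legend (70th percentile of AI for lung tumors divided by 90th percentile of AI for normal lung), the sketch below uses Python's standard `statistics.quantiles`. The interpolation method, the helper names, and the sample AI values are assumptions made for illustration; the source does not state how percentiles were computed.

```python
from statistics import quantiles


def decile_cut(values, decile):
    """Return the cut point at a given decile (7 gives the 70th percentile)."""
    # method="inclusive" treats the samples as the whole population;
    # this choice is an assumption, not something stated in the legend.
    return quantiles(values, n=10, method="inclusive")[decile - 1]


def tumor_vs_normal_ratio(tumor_ai, normal_ai):
    """70th percentile of tumor AI divided by 90th percentile of normal AI."""
    return decile_cut(tumor_ai, 7) / decile_cut(normal_ai, 9)


# Hypothetical AI values for one probeset (illustration only).
tumor = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0, 110.0]
normal = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0, 55.0]
ratio = tumor_vs_normal_ratio(tumor, normal)
print(round(ratio, 2))
```

Comparing a lower tumor percentile against a higher normal percentile is a conservative design: a ratio above 1 requires most tumor samples to exceed nearly all normal samples.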



Pkey ExAccn UnigeneID Unigene Title 



R1 



AI916973 Hs.213603 

AW189787 Hs.147474 

A16866S1 Hs.218286 
AI308300 

At389963 Hs.197505 

AW274682 Hs.1613S4 

AI469095 Hs.2g8241 

AI707881 Hs.202090 
Z4230B 

Ai859947 Hs.314158 

AW270150 HS.2S4516 

A1421S41 Ks.146164 

R10367 Hi.307921 

At3629S7 Hs.132221 

AW135830 Hs.233955 

X8S711 HSJ1B38 
W27363 

AW118822 HS.1287S7 

AI216113 Hs.126280 

A1623332 Hs.130S41 

AA235361 H&96B40 

A1492471 Hs.188270 

AI888147 H5.22Q615 

Z44342 Hs,22958 

AI582897 Hs.192570 

AW449802 Hs,28S901 

AI89035& Hs.127804 
; AA504860 

I A1041019 Hs.152454 

AW204069 Hs.312716 

AA593373 Hs.293744 

AAS65209 Ks^9439 



ESTs 5.46 

ESTs 0.58 

ESTs 4.26 

gbiag0c06j(1 NCt_CGAP_Bm20 Homo sapien 0.62 



Transmembrane protease, serine 3 
ESTs 

gb:HSC0FBt21 normalized 'rtot brain cDN 

ESTs 

ESTs 

ESTs 

EST, Weakly similar to Z232_HUMAN ZINC F 
hypothetical protein FLJ12401 
hypothetical protein FLJ20401 
hypothetical protein FLJ11191 
gb:ab37d01j1 Stratagene HeLa cell s3 93 
ESTs 

hypothetical protein FLJ23393 
KIAA1542 protein 
KIAA1527 protein 
ESTs 

ESTs, Weakly similar to T03829 transcrip 



AI927208 
AW136973 
AA677570 
AA729905 
AI142118 
AA737594 
Ai808751 
AA7M115 
AW297762 
AA84398S 
AI819198 
AA912B39 
AW4S0466 



Hs.288516 
Hs.185918 
Hs.231916 
Hs.129004 
Hs,247606 
Hs.121188 
HS.12B3S0 



AW272467 
AI878034 
AI733621 
AI077462 



F07744 

AA3842S2 

AAS81004 

X17033 

R20002 

T71508 

T78054 

AI991127 

AA344647 



T91418 

N40B34 

NHA_001501 

AJ238381 

AI286176 

AW044300 

AW269818 



hypothetical protein FLJ22028 
Homo sapiens cDNA FLJ20428 fis, clone KA 
ESTs, Weakly similar to T17233 hypotheti 



ESTs 

ESTs, Weakly similar to unnamed protein 

ESTs 

ESTs 

ESTs. Weaidy similar to AF20a846 t BM4G 
ESTs 

ESTs, WeaUy dmilar to 869890 mHogen t 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs, Weakly similar to JC5423 2-hydroxy 



Hs.190Sa8 ESTs 



Hs.1 31099 

Hs.133011 

Hs.134084 

Hs.73737 

Hs.27453 

Hs.1 08323 

Hs.7987 

Hs.286132 

Hs.62180 

Hs.271986 

Hs.6823 

Hs.1 3861 

Hs.1 17202 

Hs.1 16724 

Hs.27973 

Hs.125156 

Hs.234g5 

Hs.129715 

Hs.132576 

Hs.6786 

Hs.137506 

Hsi3244 



117 (HPI^ 
ESTs 

spndng factor, a#ilne/serhe*h 1 
ESTs, Moderately sindlar Id G01251 Rar p 
ubiquilIn<on]ugaSng enzyme E2E 2 (liomo 
DKFZP434Ft 62 protein 
D1SF37(pseudogene} 
annUn {Orasopltila Scraps ttoimiog), act 
integrin. alpha 2 (CIM9B, alpha 2 sabuni 
hypothetical protein FU10430 
ESTs, Weakly slnflar to pH sendtftro max 
gb:yc97g09j1 Soaiss intent brabi INIB H 
ESTs 



ayl.n 



crBll 



KIAA0374 protein 
transcriptional adaptor 2 (ADA2, yeast, 
hypothetical protein FLJ11252 






WO 02/086443 

302155 AI088485 Hs.144759 

302201 AJ006276 Hs.159003 

302202 AF0971S9 Hs.1 59140 
302206 A1937193 Hs.41143 
302209 AFC47445 Hs.«9297 
302235 AL049987 Hs.166361 
302290 AL1 17607 Hs.175563 
302328 AA354849 Hs.23240 
302346 A1J039101 Hs.19462S 
302360 AJ010901 Hs.198267 
302384 YQ89B2 Hs^02676 
302406 U867S1 Ks.2119S6 
302409 AF1SS156 HsJ2ie028 
302423 AB028977 Hs^74 
302432 ALOSOOeS HS.272S34 
302435 AF092047 Hs.227277 
302437 AB024730 Hs^473 
302455 AA3S6923 H&240770 
302472 AA317451 Hs.6335 ■ 
302476 AF182294 Hs^41578 

302489 T80660 Hs^30424 

302490 AA885S02 HS.1S7032 
302S62 /UOOSSaS Hs.48956 
302566 AA08S9S6 HsJZ4B572 
302830 ABa2S488 Hs^lOO 
302634 AB032^ Hs.173S60 
302638 M463798 Hs.102696 
302647 X57723 Hs.l 98273 

302655 AJ227892 Hs.146274 

302656 AW293005 Hs.70704 
302668 AAS80691 Hs.180789 



302711 L08442 

302719 W69724 

302742 Lia)59 

302755 AW384815 Hs.149208 

302771 H98476 Hs.42522 

302789 AJ245067 

302795 AJ24S313 Hs.272B38 






302802 . YDB250 

302803 AA442824 
302812 N31301 
302847 X98940 
302885 AL137763 
302943 AI581344 
302977 AW263124 



303011 AFD9040S 

303013 R17898 

303061 AF1S1882 

303077 AF163305 

303090 AA443259 

303091 AF192913 

303094 AF195S13 

303095 AF202061 
303131 AW081061 
303195 AA0S2211 



Hs.127812 
Hs.315111 
Hs^4139 



Hs.27693 

Hs.146286 
Hs.130683 
Hs.278953 
Hs.134079 
Hs.103180 



303216 AA581439 

303222 AA33353a 

303234 AA132255 

303251 AW340037 

303295 AA20^ 

303297 T80072 

303316 AFQ33122 

303467 AA398801 

303506 AA340S05 

303552 AA359799 

a)3598 AA382814 

303637 AF056083 



303756 AI738488 



303893 N88597 

303907 AW467774 

303946 AW474195 

303978 AWS13315 

303981 AW513604 

303990 AW515465 

303998 AW518449 

303999 AW516611 
304006 AW517947 



Ha.1 15838 
Hs.180532 
Ks.113503 
Hs.171880 



Hs.143951 
HS.115B97 
Hs.208057 
Hs.13423 
H8.14125 



ESTs 

transient receptor potential channel 6 
UDP-Gal:betaGlcNAc beta 1,4- galactosylt 

killer cell lectin-like receptor subfami 
Homo sapiens mRNA; cDNA DKFZp5B4F1 1 2 (fr 
Homo sapiens mRNA; cDNA DKFZp5B4N0763 (f 
Homo sapiens cDNA FLJ13496 fis, clone PL 
dynein, cytoplasmic, light intermediate 
mucin 4, tracheobronchial 
synaptonemal complex protein 2 
C03^pdlon-associated protein; antisens 
adaptor-related protein complex 4, epsil 
KIAA1054 protein 

Homo sapiens mRNA; cDNA DKFZp564J062 (fr 
sine oculis homeobox (Drosophila) homolo 
UDP-N-acetylglucosamine:a-1,3-D-mannosid 
nuclear cap binding protein subunit 2, 2 
SWI/SNF related, matrix associated, acti 
U6 snRNA-associated Sm-like protein LSm5 
Homo sapiens cDNA FLJ13540 fis, clone PL 
ESTs 

gap junction protein, beta 6 (connexin 3 
hypothetical protein FLJ22965 



odd Oz/ten-m homolog 2 (Drosophila, mous 
MCT-1 protein 

NADH dehydrogenase (ubiquinone) 1 beta s 
ESTs 

Homo sapiens, clone IMAGE:2823731, mRNA, 
S164 protein 

gb:yu86g11.r1 Weizmann Olfactory Epithel 



gb:Human immunoglobulin heavy chain, V-r 
gb:Human autonomously replicating sequen 
hypothetical protein FLJ2092a 
gb:Homo sapiens (clone WR4.10VH) anti-Oi 
KIAA1555 protein 



hypothetical protein FLJ10494 
gb:H.sapiens mRNA for variable region of 
ESTs, Moderately similar to putative DNA 
hypothetical protein FLJ20051 
gb:H.sapiens rearranged Ig heavy chain ( 
hypothetical protein LOC57822 
ESTs, Weakly similar to T17330 hypotheti 
hypothetical protein FLJ12594 
Homo sapiens cDNA: FLJ23137 fis, clone L 
gb:Homo sapiens clone 2A1 scFv antibody 
RAB22A, member RAS oncogene family 



Mnssln family member 13 
zinc finger protein 180 (HHZ169) 
Pur-gamma 



0C2 protein 
myosin, HgM I 



Hs.152328 ESTs 



ESTs 

ESTs, Weakly similar to Homolog of rat Z 
ESTs, Weakly similar to unnamed protein 
gb:EST96097 Testis I Homo sapiens cDNA 5 
phosphatidic acid phosphatase type 2C 
ATPase, (Na+)/K+ transporting, beta 4 po 
ESTs 



karyopherin (importin) beta 3 
polymerase (RNA) II (DNA directed) polyp 
Homo sapiens cDNA FLJ12383 fis, clone MA 
gbato43c12jt1 Na_C6AP_Ut1 Homo sapiens 
ESTs, Weakly similar to ALU1_HUMAN ALU S 
gb3(u71a11 J(1 Na_CGAP_Kki8 Homo sapiens 
ab3dS8fD5.x1 Na_CGAP_Ut2 Homo sapiens 
gb3qi70b11J(1 Na_CGAP_0(39 Homo sapiens 
gb3d66h02a1 NCLCGAP_Ut2 Homo sajtons 

304008 AW5iai98 HsJ297 

304009 AW518208 Hs.181165 
304024 T03038 

304026. T03160 

304028 T03266 

304036 T16855 Hs.244621 

304046 TS4803 

304081 T81S21 

304063 T62536 

304097 R2S376 

304114 R78946 

304122 H28966 

304155 H68698 

304203 N56929 

304234 W81608 

304267 AA064862 Hs.73742 

304270 AA069711 Hs.297753 

304287 AA079288 Hs.78486 

304348 M179868 

304415 AA290747 Hs.169476 

304430 AA347682 

304456' AA411240 

304521 AA464716 

304526 AA476427 

304542 AA482602 Hs.169476 

304548 M486074 Hs.297681 

304607 AA5133Q 

304640 AAS24440 Hs.1 11334 

304550 AA527489 Hs.3463 



Hs.13801 
Hs.284136 
Hs.297753 
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304760 AA5S0401 

304849 AA588157 

304917 AA6a2685 

304S21 AA603092 

304986 AA613893 

304987 AA618044 
305016 AAG26876 
3(e034 AA630128 
305072 AA641012 
305111 AA644187 
305148 AA^70 
305159 AA659166 
305190 M665955 
305232 AA670052 
305235 AA670480 
305245 AA67669S 
305312 A/V700201 
305322 AA701597 
305394 AA720942 
305413 M724659 
305447 AA737B56 
305476 AA745664 
305483 AA748030 
30^ AA769156 
305612 M782347 



Hs.169476 

Hs.81328 

Hs.1 63019 
Hs,300697 



3QS616 AA782884 

305637 AA806124 

3058% AA806138 

305650 AA807709 

305690 AA813477 



305728 AAB28209 



30S792 /M8452S6 

30S864 AA864374 

305901 AAS72968 

305910 AAS7S9S1 

306015 AA897116 

306017 AA897221 

3060^ AA897630 

308063 AA906316 

306065 AA906725 

306104 AA9109K 

308109 AA911861 

306148 AA917409 



306325 AAg53072 H5.210546 

308353 Mg613B2 Hs.275885 

306375 AAg88650 Hs.276018 

308396 AA97Q223 

3064:^ AA97S110 Hs.191228 

306442 AA97G899 

306446 AA977348 



gb3p38g12^1 



eukaryotic translation elongation factor 
gb:FB21B7 Fetal brain, Stratagene Homo s 
gb:FB26F2 Fetal brain, Stratagene Homo s 
gb:FB7C1 Fetal brain, Stratagene Homo sa 
ribosomal protein S14 
gb:yb42d06.s1 Stratagene fetal spleen (9 
gb:yb73g01.s1 Stratagene ovary (937217) 
gb:yc04c12.s1 Stratagene lung (937210) H 
ribosomal protein, large, P1 
gb:yi87g02.s1 Soares placenta Nb2HP Homo 
gb:ym31a06.s1 Soares infant brain 1NIB H 
gb:yr78b05.s1 Soares fetal liver spleen 
gb:yy82d08.s1 Soares_multiple_sclerosis_ 
gb:zd88h06.s1 Soares_fetal_heart_NbHH19W 
ribosomal protein, large, P0 



;le 937209 H 



gb:EST54044 Fetal heart II Homo sapiens 
gb:zv26g05.s1 Soares_NhHMPu_S1 Homo sapi 
gb:zx82c11.s1 Soares ovary tumor NbHOT H 
gb:zx02c05.s1 Soares_total_fetus_Nb2HF8_ 
glyceraldehyde-3-phosphate dehydrogenase 
serine (or cysteine) proteinase inhibito 
gb»h8Se08.s1 NCI_CGAP_Br1.1 Homo sapien 
ferritin, light polypeptide 
ribosomal protein S23 

gbaim75h1U1 Na_CGAP_GD9Hamasa|dens 

gb3in13g09.8l NCLCG«>_Co12Hc 

KIAA1685 protein 
PRO2047 protein 



immunoglobulin heavy constant gamma 3 (G 0.90 

gbzu89Ml&s1 Soaras_lBsSOIHT Homo sap 6.46 

gbab39c04,s1 Stratagene hmg (93721Q) H 1.00 

Ob3ii72a12:s1 NCLCSH>-Pia4 Homo sapiens 5.68 

ESTs 1.48 

gbiit01g08.s1 Na_CGAP^Lym3 Homo sapiens 1.76 

EST, Weakly simflar to EF1 0_HUMAN ELONG 1.00 

gb:ag57d1 2^1 Gsssler Wilms tumor Homo s 5.31 

glyceraldehyde-3-phosphate dehydrogenase 0.78 

Sb2g37e01.s1 Jia bone marrow stroma Hom 3.11 

nuclear factor of kappa light polypeptid 4.38 

gbaj441D7a1 SoarBSjBlaUlver_8pteen_ 2.13 

EST 1.20 

immunoglobulin heavy constant gamma 3 (6 1.16 

gbaJIOroasl Soaies_paratbyfoM_tujnor_N 5.86 

gb:nx10cOB.s1 Na_CGAP_GC3 Homo sapiens 2.21 

hypothetical protein FLJ11726 3.36 

EST 1.00 

gbaiz12e0S.»1 NCLCGAPjGlBI Homo sapiens 6.44 



Ks.272572 . ^ 

gb»j09h02:s1Soam4_paraliiyraldJunx)r_N i.uu 

HS.27586S ribosomai prateh S18 7.57 

flb»e29a12.s1 MCLC«3AP_Pr25 Homo sapiens 4.78 

gb»e23c12.s1NC3_CGAP_Pi25Hon»sapiens 0.89 
gb:nw31e04.s1 Na_CGAP_GCBOHoniosapiens4.49 

gb-.ai67aOS.s1 SoaresJes6a_NHT Homo sap 4.91 

Hs.73742 ribosomal protein, large, P0 0.19 

gb»l34aQ2.s1 Na.CGAPJ<ld6 Homo sapiens S.12 

gbak72b06.s1 Bantead spleen HPLRB2 Hom 1.66 

gteak84aa6.slBaisisads|deenHPLRB2Hom 2.34 

Hs.73742 ribosomal protein, large, P0 0.30 

gb»h63h08.s1 NCI_CGAPJ«5 Homo sapiens Z10 

gb:nx21hOZs1 Na_CGAP_GC3 Homo sapiens 0.32 
gb:am08bO7.s1 SoarBS_NFl^T_GBC_S1 Homosl.SS 

KS.1030S8 ribosomal protein S6ldnas8,g(%0, polyp 5.21 

Hs.130027 EST 1.96 

gbMkO3g03.s1 SoarBS_NFU.TjGBC_Sl Homo s 73S 

flb»k78g02.s1Na_CGAP_GC4 Homo sapiens 7.19 

gb»k8Sh1 Ul Na_CGAPJ<U3 Homo sapiens SJSO 

gb»g21a07x1 Na_CGAP_WS1 Homo sapiens 4.21 

Hs.288in6 IRNAIsopentanylpyrophosphatebansleras 2.20 

gbMo60g04.s1 Na_CGAP_Ui5 Homo sapiens Z84 

gbx)JS3hOS.s1 Na_CGAP_HI^3 Homo sapiens 1.60 

interleukin 21 receptor 1.65 

ribosomal protein S18 3.78 

EST. Moderately similar to X4662 ribos 4.30 

gb:op09d05 j1 NCLCGAPJOdS Homo sapiens aflS 

hypothetical protein FLJ20284 3.19 

gb:oq35e09.s1Na_CGAPjGC4Homosaplens 4.67 

gb:oq72el2.s1N(a_CGAPJ<id6Komosaplens 3.92 
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12.19 




306458 AA978186 
306467 AA983S0a 
306510 AAS88546 



Hs.163593 
Hs.276083 



305572 AAS9S686 



306598 AIOOO^O 

30^05 AI000497 

306656 Ai004024 

308876 Ai00S603 

308686 A1015615 

X6702 AI022S6S 

308728 A1027359 

308751 AI0325B9 

306767 AI038963 Hs^49118 ESTs 



gbMp33c065l SoaresLNFU_T_GBC_Sl Hotnos 
ribosomal protein L18a 

gb;or84d07^1 NCLCGAP_Lu5 Homosapiens 
EST, Weakly similar to RL23_HUMAN 60S R 
gb:ouS7e08j 1 NCLC6AP_Br2 Homo sapiens 
^Ms2Sc^U^ NCLCGAPJOdS Homo sapiens 
gtKOSIBclOLSl NCLCGAPJOdS Homo sapiens 






Hs.2e4136 
Hs.307670 



306956 Ai125111 

306958 AI1251S2 

307035 A1142774 

307041 AI144243 

307091 AI167439 

307181 AI1B9K1 

307297 AI20S798 

307317 AI208303 

307327 A1214142 

307382 AI223158 

307410 AI241715 

307415 AI242118 

307423 AI243206 

307426 AI243364 

307517 A1275055 

307551 AI281556 

307561 AI282207 

307608 Aia0295 

307657 A)30&428 

307691 A13182B5 

307701 AI318583 

307718 AI333406 

307730 AI336092 

307760 AI342M7 

307764 AI342731 

307783 AI347274 



H5.111334 
Hs.147333 
Hs^46381 
Hs.147885 
Hs.77039 

Hs.179573 



AI358722 Hs.276737 



307852 AI365541 

307902 AI380462 

307997 A1434512 

308002 A1435240 

308011 /U439473 

308023 AI4S2732 

308041 AJ458824 

308059 A1468938 

308085 AI474135 

308101 A)475950 

308106 AI476803 

306122 AI480123 

308154 A1500800 

308171 A1523532 

308211 AI5S7029 

308213 AI557041 

308216 AI557135 

308219 A1S57246 

308271 A1567844 

308319 AtS83g83 

308382 At613519 

308413 At636253 

308450 AIS80860 

308464 AI672425 

308588 AI718299 

308599 AI719893 

308615 AI738593 

308643 A1745040 

308673 Af760864 

308697 AI767143 

308762 AI807405 

308778 AI811109 

308782 AI811767 

308808 AI8182S9 

30M23 AI824t1S 

308875 A1832332 



ribosomal protein, large P2 
gb»u11M7jt1 SoaresJNFl_T_GBC_S1 Homo s 
PRO2047 protein 

gb:o¥29f10J(1 Soates_tesflsJIHT Homo sap 



gb:qa7Shl2jt1 SoaresJelaLheartJNbHHigW 
gb:qa33c0&s1 Soares.NhHIMPu.SI Homosapi 
gb:am66«)3.s1 Barslead spleen HPLRe2 Horn 
gb:am55e09j(1 Johnston fronlal cortex Ho 
ribosomal protein L13a 

gb:qb85bl2j(1 Soares_fetaLheart_NbHH19W 
gb3ix70>iO&s1 Soare5j<hHMPu_S1 Horn sapl 
gbx|c99gaax1 So3ras_pregnanLute(US_NbH 
ferritin, light polypeptide 
EST 

CD68 antigen 
ESTs 

ribosomal protein S3A 

gbxih92b02j(1 Soarre_NFU_T_GBC_S1 Homos 
collagen, type I, alpha 2 

gbx)ti30g11Ji1 Soares_NFl^T_GBC_S1 Homos 
tf):<ll72d03.x1 Soares_NhHMPu_S1 Homosapi 
gb:quS2f1lj(1 Na_CGAP_Lym6 Homo sapiens 
gb.'qp65a12j(1 SoafOS_^Jung_NbHL19W 
gb:qmO1f02J(1 Soares_NhHMPu_S1 Homo sapl 
ribosomal protein S19 

gb:lbin)0lJ(1 NCI_CGAP_Ov37HonnsqHens 
EST, WeaUy Mar to RL6_HUMAN 60S Rl 
small nuclear ribonucleoprotein polypept 
gbx)t43b07j(1 Soaras>tdJunaJa)HL19W 
gbxjt27f07j<1 Soares_pregnanLutBnis_NbH 
gb2io26alJ7jc1 Na_CGAP_lJi5 Homo sapiens 
gb*05d02j(l Na_CGAP_Co16 Homo sapiens 
gbiit18S)9j(1 Na_CGAP_GC4 Homosapiens 
gb3it09d02jt1 Na_CGAP_GC4 Homo sapiens 
gbJitD9g03a1 Na_CGAP_GC4 Homo sapiens 
gbx(t94a11j(1 Na_CGAP_Co14 Homo sapiens 
EST, Weakly similar to RSHU22 ribosomal 
gbqzOSgOSjil NCLCGAP.CUI Homosapiens 
gb:to02l)05j(l Na_CGAP_CU.1 H 



Hs.1 69476 
HS576877 
Hs.181165 
Hs.181165 

Hs.309411 



gbli60a08j(1 NCI_CGAP_Lym12 Homo sapien 3.7S 

hemoglobin, ^hal 0.36 

glyceraldehyde-3-phosphate dehydrogenase 4.36 

EST, WeaWy sWlar b RL10_HUMAN 60S R 1 .8C 

eukaryoHclransJafion elongation tetor 3.3{ 

eukaryoBctranstaGon elongation feetor O 
gb4i77e12j(1 Soares_NSF_F8_9W_OT_PAJ»_S2J8 

EST 2.71 

gbin93d08j<1 Na_CGAP_UI2 Homo s^iens 0.6e 

ESTs, Weakly similar to sdila>an4 [Mmu 2.4{ 

anaptasfic lymphoma ldnase(KI-1} 2.4! 

gb:PT2.1 12_E04jbjmoi2Homosajtoi8cD 3.3< 

gb:PTi1_13_K06Jturoor2Homos^nscD 4.61 

gb:PTZ1_15_D07xtumot2 Homo sapiens cO 4.83 



Hs.181165 
Hs.105749 
Hs.196511 
Hs.96e40 
Hsi77117 



Hs.259408 

HS.21B6 

Hs.217493 



KIAA0S53proteta 
ESTs 
KIAA1527 protein 

EST, Moderately slndlar to 138055 myosi 
gb:»51g12*l Barstrad aoita HPU»6 Homo 
gb:as47d07j(1 Barstsad aorta HPLRB6 Homo 
hypothetical protein FLJ23045 
gh:tn9a12j(1 NCLCGAPJW3 Homo sapiens 
gbni»i09c10jt1 NCLCGAP.CLLI HOmosapbns 
gb.-wi97807jt1 Na_CGAPJ<ld12 Homo sapien 

8bA04c11Jc1 NCI_CGAP_Ow23 Homo sapiens 
eukaryotic translation elongation factor 
^:wti52c01jc1 NCLCGAP_pr22 Homo sapiens 
annexin A2 

gteat48903j(1 Baislead colon HPLRB7 Homo 
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30B879 


AI832763 




thymosin, beta 4, X chromosome 


3088B6 


AI833240 




gb:al76d10j(1 Barstead colon HPLR87 Homo 


308898 


AI8S8345 




gbMl32diax1 Na_CGAP_Ut1 Homo sapiens 




AI365023 




phosphatidylinositol glycan, class H 


308966 


AI870704 




gb.-wl47h01jc1 NQjCGAP.Utl Homo sapiens 


308979 


AI873111 




gbniyIS2h05j(1 NO CGAP_Bm25Homosapien 


309045 


AIS10902 




gb:iq39mij(1 NCI CGAP.UtI Homo sapiens 


309051 


AI911S7S 




gbawJTSdOljd Na_CGAP_lji24 Homo sapiens 


30S069 


AI917366 


Hs.78202 


SWI/SNF related, matrix associated, act 


309083 


AI922426 


Hs.1 19598 


ribosomal protein L3 


30910S 


AI92SS03 


Hs^658B4 


ESTs 


309122 


Ai92B178 




gbnvo95a11j(1 NCLCGAP_Kid11 Homosapien 


309128 


AI928816 


Hs.1 80842 


ribosomal protein L13 


309164 


AI937761 




gbnMp8«)09.x1 KCI CGAP_Bm25 Homosapien 


309177 


AI951118 




glKWXfi3gOSj(1 NCLCGAPJBde Homo sapiens 




AI991525 


Hs^9426 


ESTs 

gb:wq66e06j(1 Na CGAP_GC6 Homo sapiens 




AW004823 




gb.-ws93a08j(1 N(£cGAP_Co3 Homo sapiens 


309411 


AW08S201 


Hs.244144 




309437 


AW(ffi0702 


Hs^78242 


tubulin, alpha, ubiquitous 


309459 


AW1 17645 


Hs.65114 


keratin 18 


309476 


AW1 29368 




Sb3(e14b05j(1 Na_CGAP_Ut4 Homo sapiens 


309499 


AW136325 


Hs.279771 


Homo sapiens clone PP1596 unknown mRNA 


309529 


AW1 50807 


Hs.1 81357 


laminin receptor 1 (67kD, ribosomal pro 


309532 


AW151119 




gb:xg33e10j(1 Ka_CGAP_Ut1 Homosaidens 


309626 


AW192004 


H$.297681 


serine (or cysteine) proteinase inhibit 


309641 


AV\/194230 


Hs.253100 


EST, Moderately similar to GKHU Ig gamm 


309675 


AW205681 


Ks.253506 


EST, Moderately similar to ATPN_HUMAN A 


309693 


AW237221 


Hs.1 81 357 


laminin receptor 1 (67kD, ribosomal prot 




AW238011 


Hs.295605 


marmosklase, alpha, class IK men^ 2 


309700 


AW241170 


Hs.179661 


tubulin, beta polypeptide 


309747 


AW2648B9 




gb:xq^ti02j(1 NCI CGAP_Lu28 Homo sapiens 


309769 


AVV272346 




gb3(s13c10j(1 Na_CG/^lKld11 Homosapien 


309782 


AVV27S1S6 


HS.1S6110 


immunoglobulin kappa constant 


309783 


AW27S401 


Hs.254798 


EST 


309799 


AWZ76g64 




^»pS8lrf)1 .x1 NC1_CGAP_Ov39 Homo sapiens 


309866 
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323338 


R74219 


Hs.23348 


323348 


AA2330S6 


Hs.191518 


323351 


AA704103 


Hs.24049 


323359 


AA234172 


Hs.137418 


323360 


AA716061 


Hs.161719 


323405 


AW1 39550 


Hs.1 15173 


323420 




Hs.263780 


323434 


AW081455 


Hs.120219 


32344S 


AA253103 


Hs.1 35569 


323449 


AA28286S 


Hs ,2841 53 




H00978 


Hs.20887 


323501 


AA182461 


Hs.84520 




AI6522B7 




323515 


AA282274 


Ms. 256083 


323541 


AI185116 


Hs.1 04813 


323545 


AIB1440S 


Ks.224559 


323635 


R63117 


Hs.9691 


323675 


AAg84759 


Hs,272168 


323678 


AL042121 


Hs.20880 


323691 




Hs. 145599 


323893 


AW297758 


Hs.249721 


323746 


AW298611 


HS.12B08 


323774 




Ks.321056 


323856 


AA355264 


Hs .267604 


323857 


T18988 


Hs.2936ffl 


323870 




Hs.129212 


323876 


AUM2492 


Hs.147313 


323885 


AA344308 


Hs.1 28427 


323911 


AL043212 


Hs.92550 


323919 


AAS62973 


Hs.220704 


323972 


Ai869964 


Hs.1 82308 


324005 


AA610011 


H5,208021 


324036 


AI472078 


Hs.303662 


324055 


AAS28794 


Hs.1 28644 


324063 


AW292740 


HS.272B13 


324072 


AA38182g 




324092 


AW289931 


H5.202473 


324095 


AW377983 


Hs.298140 


324129 


AI381918 


HS.28S833 




AWS0486a 


HS.2S8836 






Hs.225740 






Hs.28631 




A1j047634 


Hs.231913 




AA4290S8 






AU)4802S 


Hs.l 24675 




AA432032 


Hs.304420 












Hs.12B173 






Hs.300410 






Hs.14S078 


324341 


AW197734 




324343 


AW4S2016 


Hs.293232 


324371 


AA4S230S 


Hs.270319 


324382 


AVV502749 




324384 




KS.1276K 




F2^^ 




32438B 


AI924963 


Hs!3062Q6 


324432 


AA464510 


HS.152B12 


324497 


AW1 52624 


HS.1363W 


324510 


Ai148353 


Hs.287425 


324580 


AA492S88 




324582 


AAS06935 


Hs.132036 


324633 


AA572994 


HS.32S489 


324640 


AW2gSB32 


Hs.134798 


324675 


AW014734 


HS.1S7969 






ESTs 3.28 

ESTs 3.38 

Homo sapiens cDNA: FLJ23075 fis, clone L 0.06 

ESTs 10.18 

Homo sapiens DC29 mRNA, complete cds 1.48 

Homo sapiens mRNA; cDNA DKFZp547E052 (fr 3.08 

ESTs 2.31 

ESTs 5.38 

gb:DKFZp762K2310_r1 762 (synonym: hinel2) 2.38 

pleckstrin homology-like domain, family 1.06 

ESTs 0.73 

NAAISTSprola'n 5.25 



Homo sapiens cDNA- HJ21578SS. Clone C 1Z6I 

ESTs 4.42 

ESTs 2.98 

ras homolog gene family, member A 1.98 

S-phase kinase-associated protein 2 (p45 1.62 

ESTs 1.00 

ESTs 1.43 

ESTs 0.34 

ESTs 3.01 

ESTs 1.90 

ESTs 0.29 

ESTs 2.27 

ESTs, Weakly similar to NEUROD [H.sapien 0.43 

Fanconi anemia, complementation group A 3.19 

hypothetical protein FU10392 2.70 

ESTs ZM 
gb:EST382593 MAGE lesequences, MAGK HoimZZI 

ESTs 169 

RP42 homolog 1.20 

ESTs 1.25 

Homo sa]xenscONA:FU23249fis. done C a27 



3.33 
1.00 

ESTs Z01 

MARK 4.11 
Homo sapiens mRNA; cDNA DKFZp586F1322 (f 2.08 

hypothetical protein FLJ10450 3.42 

ESTs 5.97 

ESTs 3.17 

ESTs 0.36 

Homo sapiens BAC done RP11-335J18fR)m Z31 

ESTs 4.38 

ESTs S.80 

ESTs 3.10 

ESTs 6.34 

ESTs 1.00 

ESTs 0.86 

dual oxidase 1 0.45 

gb:EST94a55 Activated T.cd!s I Homo sap 2.82 

Homo sapiens cDNA:FU22278 (Is, done H 2.40 

Homo sapiens cONA:FU22S02 lis. done H 1.32 

Homo sapiens cDNA:l^J22135 lis, done H 1.40 

iiypothelical protein HJ12673 4.24 

ESTs 6.88 

Homo sapiens cDNA- FU22141 lis, done H 0.81 

ESTs 2.42 

ESTs 3.62 

ESTs. We^ simlar to T14742 hypothefi 0.14 

ESTs 3.71 

gb:OKFZp761P1910_r1 761 (synonym: ftamy2) 0.95 

ESTs 4.08 

ESTs 5.88 
fegolatorof dineienSaSon (In S. pomb 

ESTs. WeaMy dmlv lo un — ^■ 



KIAA1349proieln 
KIAA1491 protein 
hypothetical pioldn FU11215 
ESTs 

ESTs. Weakly similar b unnamed prolan 
Homo sapiens cDNA FU1 1 569 fe. done HE 
gb3«g99c08.8l NCLCGAP_Tliyl Homo sapiens 
ESTs. Weakly sbifla' to ALU1J1UMAN ALU S 






wo 02/086443 






324699 AWS04732 Hsi1275 hypothetical protein FLJ11011 a93 0.93
324747 AA603532 Hs.130807 ESTs 1.57 1.81
324748 AA657457 Hs.292385 ESTs 1.55 1.34
324801 AI819924 Hs.14553 sterol O-acyltransferase (acyl-Coenzyme 1.00 6.56
324804 AI692552 gl).Ti»d73f12jc1 NCI_CGAP_Lu24 Homo sapiens 1.00 7.53
324828 AA843926 Hs.124434 ESTs 2.00 3.25
324855 AW152305 Hs.122364 ESTs 2.74 143
324866 AI541214 Hs.48320 small proline-rich protein SPRK [human, 1.07 a95
324871 AW297755 Hs.271923 Homo sapiens cDNA: FLJ22785 fis, clone K 1.68 1.21
324886 AAB06794 Hs.131S11 ESTs Z56 5.61
324889 D31010 gb:HUMLIZU? Human fetal lung Homo sapie 2.20 4.65
324948 AW383818 Hs.2654S9 ESTs, Moderately similar to ALU2_HUMAN A SJ& 7.05
324953 AI264628 Hs.125428 ESTs 331 5.51
324958 AAB2a)76 Hs.132892 protocadherin 20 5.12 9.81
324988 T06997 Hs.121028 hypothetical protein FLJ10549 2J52 1.08
325024 F13254 Hs.78672 laminin, alpha 4 62* 10.22
32510S H97109 Hs.105421 ESTs 1.00 1.00
325108 AM018S3 Hs^BO ESTs 1.99 Z14
325114 D83901 Hs.31S562 ESTs zn ai7
325146 AI064690 Hs.171176 ESTs 1.86 3.41
325149 061117 Hs.t87646 ESTs 0.42 ass
325187 AI6536a2 Hs.197812 ESTs 6.50 11.31


325228 








6,18 


15.76 


325235 








2.64 


4.12 


325328 








2.87 


4.42 


325340 








0.29 


0.33 


325367 








16.56 


24.29 


325373 








0.63 


1.22 


325389 








asB 


105 


325436 








5.75 


14.14 


325471 








8.48 


17J2 


325498 








3.32 


6.42 


325557 








5.51 


8.28 


3255S9 








7.48 


21.40 


32S60 








4.08 


6.25 


325569 








4.20 


5.24 


325585 








1.10 


1.13 


325587 










1010 


325597 








Z98 


13.40 


325639 








aTB 


0.78 


325685 








0.46 


0.66 


325686 








0.95 


1.55 


325735 








4.48 


9.20 


325739 








0.59 


a88 


325740 








2.42 


6.61 


325792 








7.88 


9.83 


325819 








4.74 


7.18 


325883 








2.02 


2.64 


325895 








7.78 


15.98 


325925 








Z04 


10.60 


325932 








4.18 


7.36 


325941 








3.66 


9.03 


325969 








0.61 


0.80 


325971 








4.88 


7.42 














328046 








Wl 


14.72 


326099 








3.60 


5.98 


326108 








1.27 


^xa 


326163 








3.27 


5.70 


326165 








0.45 


1.11 


326189 








0.13 


0.45 










5.60 


9.00 


326230 








7.00 


12.01 


326274 
326360 








1.00 
9.8S 


8X9 
(&3S 


326393 








as2 


0.77 


326505 








1.00 


1.42 


326515 








1.24 


5.84 


326589 








a20 


13.49 


326592 








i77 


4.01 


326605 








2.01 


2.53 


326692 








1.00 


1.00 


326693 








1.00 


«1 


326720 








0.19 


0.65 


326742 








2.34 


7.20 










a25 


0.83 


326818 












326936 








2.08 


145 


326964 








a4i 


1.70 


326983 








2.02 


3.80 


326991 








1.09 


1.20 


327036 








1.00 


8.04 


327040 








3.05 


4.22 


327053 








3.55 


6.31 


327075 








1.59 


1.40
327130
327156
327220
327224
327288
327321
327332
327361

327377
327396
327414
327442
327467

327473
327483
327562
327568
327608

327611

327734
327775
327796
327840

328157
328196
32B197






26.47 
14.56 
10.22 






329134 
329157 
329178 









329860 
329993 
330020 



ia77 

14.21 
13.12 






330263 
330300 
330313 



330385 AA449749 

330397 014659 

330468 L10343 

mm L24203 

330478 L38466 

330493 M27826 

330495 M31328 

330S0S M61S06 

330512 M80563 

330537 U19765 



3305% U90437 

330601 U90916 

330805 X02419 

330609 X04741 



H5.ia2971 

HS.1S43B7 

Hs.1 12341 

Hs.82237 

Hs.296049 

Hs,267319 

Hs.71642 

Hs.6241 

HS.812S6 

Hs.2110 

Hs.183671 

Hs.299a67 



H5.B2845 
Hs.77274 
Hs.76118 






330660 AA347B68 

330692 AA01704S 

330707 AA133891 

330715 AA233707 

330717 AA233926 

33am AA243S60 

330740 AAffl7746 

330742 AA400979 

330744 AA406142 

330751 AM28288 

330760 AA448663 

330763 AA450200 

330786 D60374 

330790 T48538 

330814 AA015730 

330827 AA040332 

330844 AA0SU37 

330901 AA1S7B1S 

330931 F01443 

330952 H02aS5 

330961 H10998 

330968 H16568 

331014 HgS^ 

331046 Na563 

331Q60 N75081 

331099 R36671 

33110S R41408 

331131 RS4797 

331135 RB1398 

331170 T23461 

331180 T3244S 

331183 T40769 

331203 TB2310 

331271 AA059347 

331306 AA252079 

331327 AA%1076 

331341 AA303125 

331359 AA416979 

^1383 AA421S62 

331378 AA448a81 

331384 AA456001 

331402 AAS05135 

331422 F10B02 



karyopherin alpha 5 (importin alpha 6)
KIAA0103 gene product
protease inhibitor 3, skin-derived (SKAL
ataxia-telangiectasia group D-associated
microfibrillar-associated protein 4
endogenous retroviral protease
guanine nucleotide binding protein (G pr
phosphoinositide-3-kinase, regulatory su
S100 calcium-binding protein A4 (calcium
zinc finger protein 9 (a cellular retrov
tryptophan 2,3-dioxygenase
hepatocyte nuclear factor 3, alpha
(NONE)

gb:Human RP1 homolog mRNA, 3'UTR region
Homo sapiens cDNA: FLJ21930 fis, clone H
plasminogen activator, urokinase
ubiquitin carboxyl-terminal esterase L1



S100 calcium-binding protein A2

ESTs, Weakly similar to ALU7_HUMAN ALU S



Homo sapiens cDNA FLJ11570 fis, clone HE

integrin, beta 8

ESTs

Homo sapiens voltage-gated sodium channe




Hs.274337 
Hs.49136 
Hs.105807 



Hs.12744 
HS.N8Q3 
Ks.%7319 



hypotheticai protdn FLJ20686 

ESTs. Moderately similar to AUi7_HUMAN A 

ESTs 

ESTs, Wealdy slmiiar to transtarniaOon-r 



Hs.7164 

Hs.23748 

Hs.30340 

Hs.191358 

Hs.157148 

Hs.B3g37 

Hs,21983 

H5.4197 
Ks.159293 
Hs.6640 
Hs.8469 



hypolhetica] protein KIAA1165 
ESTs 

Homo s^ens cONA FU1 1883 fis. dona HE 



gb:nS7b073l Soares infant brain INtB H 

ESTs 

ESTs 

Human DNA sequence from PAC 75N13 on chr 
ESTs 
(NONE) 



Hs.46901 
Hs.91011 
Hs.49282 
Hs.93847" 
H8.44037 
Hs.163628 



daehslnmd (Oiosophila) homolog 
ESTs 

Homo sapiens cONA fOJ 13496 (is. doi» PL 
KIAA1462 protein 

anterior gradient 2 (Xenepus iaevls] horn 
hypothetical pretdnFU11088 
NAOPH oxidase 4 
ESTs 

E8Ts,l»oderatelysindlartoALU7_HUMAN 






331480 N32912 Hs^6813 
331531 NS1343 
331S47 N54811 



331SB9 N71027 

331608 N89861 

331614 N92293 

331668 W69707 

331671 W72033 

331676 W7g834 

331681 W85712 

331692 W935g2 

331717 AA190888 

331718 AAmm 
331811 M404500 
331820 AA405970 
331831 M412031 
331852 AMI 8988 
331943 AA453418 
331969 AA460702 
331990 M478102 



332027 AA489671 

332029 M46g6g7 

332033 M489840 

332048 M4960ig 



Hs.152618 
Hs.112110 
HS.24Q272 
H5.S8C30 
Hs.194895 
HS.58SS9 
Hs.1 19571 
Hs.152213 
Hs.153881 
H8.104072 
Hs.301570 
Hs.97996 
Hs.97901 
Hs.g3314 
HS.2127S 
Hs.82772 
Hs,139631 
Hs.105104 
Hs.65641 
Hs. 145053 
Ks.251014 
HS.201S91 



332074 AAS9g012 

332083 AA600200 

332085 AA6003S3 

332125 AA609S61 

332177 F10812 

332180 H03348 

332185 H103S6 

332203 H49388 

332232 N48891 

332240 N548Q3 

332261 N70294 

332275 R08838 

332280 R38100 

332299 R69250 

332304 R74041 

332314 T25B62 

332384 M11433 

332434 N7S542 

332445 T63781 

332453 UI020S 

332458 M33493 

332504 AA053917 

332525 M17252 

33^ M31682 

332535 N202B4 



C0A14 

gbvz15g04.s1 Soares_mUlliplej«derosIs_ 
gb»d74fa4^1 NCI_CGAP_0v2 Homo sapiens 
ESTs 
ESTs 

PTD007 pralaln 
EST 



berl 

ESTs, matf sMIar to rfaotel^ (hLmusc 
coltagen. type III. dpha 1 (Ehters^ 
wingless-type MMTV kdegraBon site biri 
Homo sapfens NY-REN-62 anl^en mRNA, par 
ESTs 
ESTs 

naBon factor, mitoc 






332559 Mt395S 



332638 AA2B3034 

332640 AA4171S2 

332654 AA001298 

332665 AA22333S 



Hs.1 55548 

Hs.173933 

Hs.312447 

H5.101433 

Hs.732r 

Hs.101689 

Hs.317769 

Hs.101915 

Hs.324ffi7 

Hs.269137 

Hs.26530 

Hs.1 46381 

Hs.21201 

Hs.101539 

Hs.101774 

Hs.101850 

Hs.289068 

Hs.1 1112 

Hs.1 11758 

Hs.250700 

Hs.15108 

Hs.278430 

Hs.1735 

Hs.19280 

Hs.20183 

Hs.166ie9 

Hs.274407 

Hs.25272 

Hs.3239 



Homo saddens mRNA; cDNA OKFZp5B6lJ0120 (f 
hypothetical protein FU1101 1 
conagen.lype 30. alpha 1 



EST 
ESTs 

KIAA1211 protein 

gb:ae41e11.s1 GesslerWlmslumor Homos 
KIAAIoeo proieta: Golgl-asaoclatBd. gamm 

nuclear factor VA 
ESTs 
ESTs 



ESTs 
EST 



inl 



ESTs, Weakly similar to putative p150 [
ESTs

serum deprivation response (phosphatidyl
RNA binding motif protein, X chromosome
nectin 3; DKFZP566B0846 protein
ESTs

hypothetical protein FLJ23045

retinol-binding protein 1, cellular

Homo sapiens cDNA FLJ11918 fis, clone HE

ESTs

keratin 6A

byptaselietal

chromosome 14 open reading frame 1
cytochrome P450, subfamily XXIA (steroid
inhibin, beta B (activin AB beta polypep
cysteine-rich motor neuron 1
ESTs, Weakly similar to AF164793 1 prote
cytokeratin 2



332781 AA233258 



Hs.50640 

Hs.5101 

Hs.288217 

Hs.637a8 

Hs.247926 

Hs.79070 

Hs.1 14765 

Hs.296938 

Hs.247112 



InXA 
JAK binding protein
protein regulator of cytokinesis 1
hypothetical protein MGC2941
propionyl Coenzyme A carboxylase, beta p
gap junction protein, alpha 5, 40kD (con
v-myc avian myelocytomatosis viral oncog
myeloid/lymphoid or mixed-lineage leukem
dual specificity phosphatase 7
hypothetical protein FLJ10902



332911 
332912 
33:S22 






334094 
334113 
334161 
334183 



334616 
334633 
334648 
334787 






336716 
33672t 
336798 
336900 



337128 
337162 
337183 
337184 



338179 
338182 
338ira 
338197 






338374 
3384U 
338418 



338671 
338676 
3M726 



338871 
338872 ■ 
338879 



339047 
333100 
339114 
339121 
339170 



TABLE 8B shows the accession numbers for those Pkeys in Table 8A lacking Unigene IDs. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
'Accession' column.
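
For readers who want to work with Table 8B programmatically, the row layout described above (Pkey, gene cluster number, then one or more Genbank accession numbers) can be sketched in Python as follows. This is an illustrative sketch only: the names `ClusterRow` and `parse_cluster_row` are hypothetical, and it assumes each row has been restored to a single whitespace-delimited line.

```python
from typing import List, NamedTuple


class ClusterRow(NamedTuple):
    pkey: str              # Eos probeset identifier
    cat_number: str        # gene cluster number, e.g. "44320_1"
    accessions: List[str]  # Genbank accession numbers in the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Split one whitespace-delimited row into Pkey, CAT number and accessions."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(
            f"expected Pkey, CAT number and one or more accessions: {line!r}"
        )
    return ClusterRow(fields[0], fields[1], fields[2:])


# Row taken from Table 8B: Pkey 322060, cluster 44320_1.
row = parse_cluster_row("322060 44320_1 AI341937 AW003063 U34725 AA904742")
print(row.pkey, row.cat_number, len(row.accessions))
```

Rows whose Pkey or accession list did not survive extraction have fewer than three fields and are rejected with a `ValueError`, so nothing is guessed.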



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers


Pkey CAT number Accessions

322044 187363_1 AW340g26 AA249053 N86075
322060 44320_1 AI341937 AW003063 U34725 AA904742
321430 42705_1 X57414 X57415
321467 43034_1 X13075 X13076
32212S 46779_1 R93901 AF075073 R93902
322166 46861_1 H69434AR)85gS8H6984S
322173 46873_1 H52S67 H52557 AFD85970 H52164
322178 46882_1 H56535yy^085980 H5S712
322179 46885_1 H92891AF08a82H92777
321577 1615102.1 H84849 H84252 H84260 H86684 H853Z0
321587 1615333_1 H95531 H9S521 HS4529
313723 111953_1 AA070412 AA102346 AA081885
320997 62749_1 H22S44 H46842 AI204929
322278 47271_1 W69304AiH)86283W69200
       218439_1 AA62S149 AA313030 AA313052 H97463
313883 129439_1 AA665aa9 AA13S130 AA484059 AA102419 AW877765
       47422.1 W79150 AF086419
322339 814584.1 AI668646 AI734214 W17348
314648 293660_1 AW97926SAA878419M431342AA43ie2B
300201 682222.1 AI308300 AI308296
308897 25196.-2 AI093967
323155 979809.1 ALiaOTOI AL13S041 AL121S24
322527 3B927_1 AF147359TS8511T5aS60
322585 473768.2 W88919 W89125
300362 1574395.1 242308 H23514
322635 82296.1 AA005129 AA679084 AA694399
322664 85042.1 AA011522 AA702841 AA011691 AA330797
31S4S4 380580.1 AI239464 AI239473 AA62ai2 AI208703


322687 
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1923101-1923205 








1961 767-1961358 








1962120-1962246 








2X)9620-2009738 








2510526-25106S8 








2518145-2518213 








3369205-3369323 








3369495-3369571 








3978070-3978187 








4904775-4904846 








4910935-4910997 
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5612620-561 2780 








6234778-6234894 








6562391-6562566 








662t)584-6620903 




(junnam, L ei.al. 




6629004-6629233 




Duntiam, L etat. 




6796852-6797128 




Duidianii 1. et^. 








Donham, 1. et.al. 




7608165-7608234 




Ounham, 1. etal. 




7S92491-7692630 




Dunham, 1. et.aL 




7694407 -7694623 




Dunham, 1. eLal. 




7695440-7695697 




Dunham, i. eLal. 




7696625-7696707 




Ounham, 1. etal. 




7706773-7706902 




Dunham, 1. etal. 




7746805-774691 6 












Dunham, I. sLsH. 




B1S396l>4lS4161 




Dunham, 1. etd. 




8154882-815S02S 




Dunham, t etal. 




8156437-8156709 




Duidiam, 1. etal. 




81 56825-81 57001 












ijunham, 1. etal. 




6oo31 So-o5a3o35 
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334378 


Ounham, 1. etal 


Plus 


13907239-13907370 


334382 


Ounham, 1. etal. 


Plus 


13915866-13916036 




Duntiam, 1. etal. 




1 4987847*1 4987940 








1S032740-1S032B17 








15176123-15176470 








15333206-15333305 








1887221 4-1 W723 1 7 








19299770-19299944 








20103970-20104058 


33501S Dunham, I. et.al. Plus 20682792-2068^5
335120 Dunham, I. et.al. Plus 21436286-21436384
33512S Dunham, I. et.al. Plus 21441390-21441471
335179 Dunham, I. et.al. Plus 21634405-21634526
33St8S Dunham, I. et.al. Plus 216ffl1ie-216S932S
335211 Dunham, I. et.al. Plus 21774611-21774680
335361 Dunham, I. et.al. Plus 22807292-22807445
33S379 Dunham, I. et.al. Plus 22899306-22899420
335414 Dunham, I. et.al. Plus 23235546-23235684
335416 Dunham, I. et.al. Plus 23237354-23237465
335496 Dunham, I. et.al.      24164386-24164545
335497 Dunham, I. et.al. Plus 24167666-24167869
335558 Dunham, I. et.al. Plus 24740167-24740347
335586 Dunham, I. et.al. Plus 24990333-24990497
33K88 Dunham, I. et.al. Plus 25439839-25439920
335784 Dunham, I. et.al. Plus 25942710-25942792
335823 Dunham, I. et.al. Plus 26365925-26366004
335983 Dunham, I. et.al. Plus 27938968-27939070
335995 Dunham, I. et.al. Plus 28009044-28009184
336021 Dunham, I. et.al. Plus 2B68648^2e6865S9
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Plus 


29014404-29014590 






PJus 


^022963-290231 65 






Plus 


29987731-299878^ 






Plus 


983890-985529 


336633 


Dunham, 1. el.al. 




985591-986221 


336634 


Ounliam, 1. eld. 


Plus 


986296-986670 








987908-988364 






Plus 


988418-989185 






Plus 


989276-990813 






Plus 


991906-993240 






Plus 


1896402-1896478 




Dunnani, L eiai. 


Plus 


2420546-2420616 








3371522-3371586 


336900 


Dunham Utal. 


Plus 


10236423-10236523 


336948 


Dunham, LeLaL 
^nham, 1. 6ta). 


Plus 


12692290-12692381 
1664481 7-1 6644942 




Duj^iani, 1 6tal. 


Plus 


17821742-17821922 






Plus 


23478943-23479145 


337183 


Dunham, 1. elal. 




23943606-23943696 


337184 


Dunham, 1. elal. 


Phis 


23373949-23974016 








28011979-28012034 






Plus 


29022656-29022775 






Plus 


31401509-31401579 




uunnafn,-LeLai* 
Ounhai\ 1. alal. 


Pha 
Plus 


33330760-33330981 
34474472-34474531 




Dunham, h elal. 


Phis 


39717644971900 






Rus 


4449069-4449193 




uunnajn, I. atal. 




5443027-5443101 
6969162-6969270 




Dunham, 1. elal. 
Dunham, L etal. 


Plus 
Plus 


7697068-7697236 




Dunham, L etal. 


PUIS 


8092128-8092271 




Ounhanib 1. alaL 




10384481-10384621 


338112 


Dunham, LeLaL 




10391398-10391600 


33B145 


Dunham, LeLd. 


Plus 
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Dunham, L eLaL 




1 1448985-1 1449085 




Dunham, L elal. 


Phis 
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Dunham, L etal. 


Plus 
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Dunham, L etal. 
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Dunham, L etat 
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18082184-180S2402 




Dunham, L eLaL 
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Dunham, L etat 
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21244713-21244828 




Dunham, 1. etal. 


Plus 
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Dunham, I elal 


Phis 
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OuRham, L ti-flL 
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24404720-24404899 




Ounhan^ LeLaL 


Phis 
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Dunham, 1. eLd. 


Plus 


27792166-27792272 




Dunham, 1. etsL 
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Dunham, 1. ^aL 


Pius 


29160665-29160725 




Dunha/n, 1. etal. 
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30077787-30078184 


339047 


Dunham, I etal. 


Plus 


30760793-30760968 


339100 


Dunham, 1. etal. 


Rus 


31141580-31141765 




Dunham, t eLA. 




31466454-31456519 


339121 


Ounhan,Letal. 


Plus 


31583467-31583536 


339170 


OunhoaleLsL 


Plus 


32216399^16527 




Dunham, LeLaL 






332858 




Mmis 


1339607-1339397 






Mnus 


2628296-2628109 






Minus 


2632606-2632457 




Dunham, L etal. 
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2711704-2711565 
3028925-3028811 




Dunham, L etal. 


Minus 
Minus 


32041244204038 




Dunham^ L etal. 


Minus 


330844G43083S8 




Dunham, L 
Dunhani L eLal. 


Minus 
Minus 


33a95gM3a9531 
3310817-3310749 






Minus 


3377220-3376309 








430840M308304 






Minus 


6466335-6465727 






Minus 
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Minus 
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333906 


Dunhan. L etal. 


Minus 
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334183 


Dunham, 1. etaL 


Mhius 


11832582-1183^ 


334187 


Dunham, L elal. 




1 1921456-1 192120S 


334222 


Dunham, L eLal. 


Minus 
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334223 


Dunhan, 1. etal. 


llAtus 


12734365-12734269 


334255 


Dunham, L etaL 
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334492 


Dunham, L etal. 
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Dunham. L etal. 
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Dunham, L etaL 
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Dunham, t elal. 


Minus 
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334945 


Dunham. 1. eLaL 




334967 


Dunham, 1. et.al. 




334990 


Dunham, 1. e1.d. 


Mnus 20341159^20341087 


335093 


Dunham, 1. ^al. 


Minus 2127367-21297214 


335288 
335289 


Dunham, L elal. 




33SS48 


Dunham, 1, eUd. 
Dunham. 1. eld 


Minus 2Z3uo9S0-2Z30570o 
Mnus 24652773^4662673 


335K1 


Dunham,!, elal. 


Minus 24679628-24678961 


33K19 


Dunham, L eLaL 


Minus 25082677-25082498 


335620 
335621 


Dunham, \. eLaL 


Minus 25092561-25092434 


335682 


Dunham, 1. aLal 
Dunham, 1. et.aL 


Minus 25098878-2S098767 
Minus 25421215-2S421093 


335755 


Dunham, L eLaL 




335814 


Dunham, L eLai. 




33S81S 


Dunham, L et^al. 


Minus 2632051 8-2S320421 


33S835 


Dunham, L aLal 


MiTuis 2639331 1-28393245 


3358S1 


Dunham, L eLaL 


Minus 26604863-26604742 


3358E8 
3358SS 


Dunham, L eLaL 


Minus 26711437-26711300 


335936 


Dunham, L eLsi. 
Dunham, I eLai. 


Mhius 26977639-26977553 
Minus 27360474-27360400 


33594S 


Dunham, 1. eLai. 


Minus 27555924-27555788 


336066 


Dunham, L eLaL 


Minus 29241080-29240842 


336205 


Dunham. L eLaL 


Minus 30477456-3047731 1 


336275 


Dunham, 1. eLai. 


&£nus 32086675-32086536 


3^92 


Dunham, \. eLsl. 




336331 


Dunham, L eLaL 


Minus 00094527-33594371 


336419 


Dunham, L et.aL 




336675 


Dunham, L eLai. 




336684 


Dunham, 1. eLai. 


fi/^us 2158060-2157993 


336716 


Dunham, L eLai. 


Minus 3259952-32S9B62 


336798 


Dunham, 1. eLai. 


Minus 5588954-5888757 


337043 


Dunham, 1. eLai. 


hfinus 17407330-17407251 


337046 


Dunham, 1. eLai. 




337128 


Dunham, 1. eLaL 


Minus 22215251-22215034 


337192 


Dunham, 1. eLai. 


Minus 24591853-24591771 


337194 


Dunham, \. eLai. 


Minus 24610510-24610359 


337229 


Dunham, L eLai. 


Minus 26716579-26716481 


337325 


Dunham, L eLai. 


Minus 30015948-30015800 


337497 


Dunham, 1. eLaL 


Minus 33371317-33371258 


337500 


Dunham, 1. eLai. 


Minus 33376212-33376158 


337603 


Dunham, L eLai. 


Minus 1299296-1299194 


337605 


Dunham, L eLai. 


Minus 1346555-1346397 


337671 


Dunham, L eLaL 


l^s 3260634-326^47 


337786 


Dunham, L eLaL 


Minus 41332034133081 




Dunham, L eLai. 


Minus 53476^5347550 


338083 


Dunham, L eLai. 


Minus 9318438-9918301 


338158 


Dunham, L eLai. 


Minus 11794465-11794343 


338161 


Dunham, L eLai. 


Minus 12124716-12124658 






Minus 12824919-12824827 


338189 


Dunham, 1. eLai. 


Minus 12878594-12878478 
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Dunham, L et.aL 


Minus 13760865-13760780 


338215 


Dunham, 1. eS.a]. 




338469 


Dunham, i. eLai. 


Minus 20S20387-20520242 


338549 


Dunham, I eLaL 


Minus 22049171-22049081 






Minus 2231 1966-22311856 


338671 


Dunham, L eLai. 


Mnus 24508421-24508346 


338876 
338726 


Dunham, L eLai. 
Dunham, I eLai. 


Mnus 24637427-24637369 


338779 


Dunham, 1. eLaL 


Mnus 25926206-25925618 
Minus 27030151-270S^79S 


338871 


Dunham, I e».al. 




338872 


Dunham, 1. eLai. 


Jranus 2oouv921-2830079U 


338986 


Dunhosn, L eLai. 


Mnus 29614876-29614749 


339229 
339264 


Dunham. L eLaL 




325228 


Dunham, L eLaL 
6381940 PhB 


Mnus 32975145-32975053 
2630-2694 


32S235 


6381943 M-nus 


1621S4-162264 


329588 


3962484 Plus 


lira-1619 


329560 


3982491 Phis 


2095-2990 


329341 


3983503 Minus 




325328 


5S6SS7S Rus 


6o7oU-o6854 


32S340 


6017033 Minus 


166656-166819 


325373 
32S367 
32S389 


5866920 Mbuis 

5866920 Miittis 

5866921 Phis 


1 136686-1 1 36777 
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325436 
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SBG6939 Mnus 
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325560 


6243S95 Minus 
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325569 


6249599 Plus 


79927-80217 


3^7 




126724-126967 




Mffi4ffl Phis 


73476-73574 


325597 
325639 




1065O2O-1O8S089 
2S3525-253808 
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325739 5867038 Minus 

325740 5867038 Minus 
325792 6469828 ' Minus 
32573S 6SS2447 Minus 
325685 6682468 Plus 
32S688 6682468 Plus 
325819 6682490 Minus 
329764 6048195 Minus 
3^703 6065793 Minus 
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5867124 Plus 
5867127 Plus 



325941 5867133 Minus 

325969 5867163 Phis 

326971 5867153 Plus 

329993 4567166 Minus 

330020 6671887 Plus 

326163 5867168 Minus 

326274 5867171 Minus 

326025 5857176 Plus 

326046 5867182 Minus 



326108 
3^165 
326189 
326204 

330052 
330036 
326350 
326589 
326393 
326505 
326515 
326592 
330107 
330108 
330100 
330093 
330088 
330085 
330120 
330123 
326742 
325605 
326818 
326720 
326770 



327040 
327053 
327075 
327085 
327036 
327130 
327156 



327220 
327224 
327321 
327%1 
327396 
327414 
327442 
327467 
327473 



5867187 Minus 

5867208 Minus 

6887212 Plus 

5867218 Minus 

5867230 Minus 

4557182 Plus 

6042048 Plus 

5867293 Plus 

5867320 Plus 

5867341 Plus 

586743S Minus 

5867439 Plus 

6138928 Plus 

6015243 Minus 

6015249 Minus 

6015253 Rus 

6015278 Plus 

6015293 Plus 

6015302 Minus 

6671864 Minus 

6671869 Minus 

5867611 Minus 

5867637 Phis 

6117831 Minus 

6552456 Plus 

6598307 Minus 

66B2502 Plus 

6682502 Minus 

5867657 Minus 

5867660 Plus 

6004446 Minus 

6469836 Plus 



6531965 Plus 

6S31965 Plus 

6531965 Plus 

6531976 Plus 



5867743 Plus 

5867750 Plus 

5867759 Plus 

6867772 Plus 

5867775 Plus 

5867783 Plus 

5867793 Minus 



S867891 Minus 
58S7910 Mtaus 
5867940 Minus 



207533-207690 
1018-1176 
269122-269190 
117397-117483 



139994-1401^ 
53403-53537 
70295-70423 
163474-163605 



358317-358476 

115749-115962 

7369-7441 

64228W02 

101911-102081 

105841-KB035 

101307-101434 

172397-172491 

7831-8035 

410289410404 

70854-70915 



661381-661510 



69288-6S413 

148088-148200 

301868^1972 



117120-117216 

13627-13844 

22760-22919 

4170241841 

8818-8949 

36683-36809 



100091-100282 

99443-99778 

21166-21301 

1043-1199 

37517-37638 



127SS3-1 27656 

35311-35408 

95187-95248 

24656-24749 

15199-15309 

84525-84677 

513803-513668 

117697-117899 



16023-16581 

18147-18339 

10217-10357 

75340-75456 

783670-783817 

2247267-2247437 

4041318-4041431 

4734947-4735069 

319951-320040 

20247-22343 



4858348773 
S6361-S8532 
65701-65781 



61013-62130 

8702-«820 

102461-102586 

111483-111618 

88030-88151 

75101-75181 

181573-181662 

37610-37676 

343969-344474 

46152-46287 



179063-175392 
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130791-130871 








85267-85405 








73085-73208 




6013539 




66517-66931 








101503-101634 


328004 






157407-157887 


328101 


5868020 


Plus 


289920-290014 


328100 


586B020 


Mnus 


263545-263635 




586B024 




80378-80491 




5868064 




73326-73615 




S868080 




16551-16729 




5868081 




42133-42438 








9S240-95428 




5866216 




66611-66677 




5902482 




713478-714590 








253903-254022 
5S0a&-S5404 








3246-33)2 








87770-87953 








38889^10 




56od2o9 




293920-294224 


328623 


5868246 


Mtas 


120020-120126 


328632 


5868247 


Plus 


76734-76853 




5868254 












625555-625633 


328700 






764089-764203 








68114^8854 




5868289 




89389-69455 




5868289 




274638-274726 








29408-29684 








149708-149889 
59955-60094 








270724-270798 








7S371-75S83 








662758-662848 








217275-217336 








8987-9180 








59098-59481 








334973-335406 








1193739-1193866 








108317-108403 








117002-117059 




SBdoouu 




771755-771889 








846342-846448 






Minus 


43552-43619 




6042030 




33642-33775 








85470-85673 








151837-151914 








317461-317688 








5390-5479 








3246642562 








146417-147652 








29959-30018 


329157 


SB68687 


Minus 


145940-146155 


329178 


5868704 




179177-179463 


329192 


5868716 


Plus 


166936-167020 


329194 


5868716 




3044S0-304559 


329204 


5868720 


Minus 


3050-3190 


329224 


5868728 


Plus 


27422-27664 


329228 


5668728 


Minus 


50116^0287 


329288 


5868771 


Plus 


25554-26299 


329337 


5868806 




46715M67222 


329011 


6682532 


Plus 


48658-48741

TABLE 9A: Potential Therapeutic, Diagnostic and Prognostic targets for Therapy of

Table 9A shows about 1312 genes upregulated in lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid
tumors) relative to normal body tissues. These genes were selected from about 59680 probesets on the Eos/Affymetrix H^^^

Table 9B shows the accession numbers for those Pkeys lacking Unigene IDs for Table 9A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
'Accession' column.

Table 9C shows the genomic positioning for those Pkeys lacking Unigene IDs and accession numbers in Table 9A. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.
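
As an illustration of how the strand and nucleotide-location columns of the genomic positioning tables might be read, the following Python sketch parses a strand label and a location span and derives the exon length. It is a hedged sketch, not part of the disclosure: the names `ExonLocation` and `parse_exon_location` are hypothetical, and it assumes that coordinates are inclusive and that Minus-strand spans are written with the higher coordinate first, as the rows in this document appear to be.

```python
from typing import NamedTuple


class ExonLocation(NamedTuple):
    strand: str  # "Plus" or "Minus"
    start: int   # lower coordinate on the genomic sequence source
    end: int     # higher coordinate on the genomic sequence source

    @property
    def length(self) -> int:
        # Assumption: both endpoints are part of the exon (inclusive span).
        return self.end - self.start + 1


def parse_exon_location(strand: str, span: str) -> ExonLocation:
    """Parse a strand label and an 'a-b' span into ordered coordinates."""
    if strand not in ("Plus", "Minus"):
        raise ValueError(f"unrecognized strand label: {strand!r}")
    first, second = (int(part) for part in span.split("-"))
    start, end = sorted((first, second))
    return ExonLocation(strand, start, end)


# Example rows as they appear in the positioning tables of this document.
plus_exon = parse_exon_location("Plus", "13907239-13907370")
minus_exon = parse_exon_location("Minus", "25082677-25082498")
print(plus_exon.length, minus_exon.length)
```

Spans that lost their hyphen or contain non-digit characters during extraction fail the integer conversion and raise `ValueError`, which keeps damaged rows from being silently misread.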

Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number
Unigene Title: Unigene gene title

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small

R2: Avc^tfmivff^ 

Pkey ExAccn UnigeneID Unigene Title R1 R2

400195 NM_007057*:Homo sapiens ZW10 interactor 1.00 1.00
400205 NM_006265*:Homo sapiens RAD21 (S. pombe) 15.80 396.00
400220 Eos Control 2.28 2.84
400277 Eos Control 7.68 9.72
400285 Eos Control 1.00 1.00
400288 X06256 Hs.149B09 integrin, alpha 5 (fibronectin receptor, 1.04 2.24
400289 X07a20 Hs.2258 matrix metalloproteinase 10 (stromelysin 132.4S 4.00
400298 AA032279 Hs.6163S six transmembrane epithelial antigen of 43.88 74.00
400301 X03635 Hs.1657 estrogen receptor 1 1.00 1.00
400303 AA2427S8 Hs.79136 LIV-1 protein, estrogen regulated 1.75 1.65
400328 X87344 Hs.180062 transporter 2, ATP-binding cassette, sub 0.87 1.80
400419 AF084545 Target 156.55 253.00
400512 NM_030878*:Homo sapiens cytochrome P450, 1.00 2.00
400517 AF242388 lengsin 3.67 87.00
400560 NM_030878*:Homo sapiens cytochrome P450, 1.00 1.00
400864 NM_002425:Homo sapiens matrix metallopro 20.26 45.00
400865 NM_002425:Homo sapiens matrix metallopro 1.38 1.07
400666 NM_002425:Homo sapiens matrix metallopro 3.28 3.22
400749 NM_003105*:Homo sapiens sortilin-related 1.00 91.00
400763 Target Exon 7.63 24.00
401027 Target Exon 1.00 1.00
401093 C12000S86-:gq63301671dbpAAB6477.11(A 1.00 155.00
401203 Target Exon 1.00 86.00
401212 C12000457':gI|7512178|pirP30337palypr 1.00 40QJ»
401411 ENSP00000247172*:HYPOTHETICAL 126.2 kDa 1.00 72X»
401435 C14000397*:gil7499898|pirt[T33295hypoth 1.00 64.00
401464 AF039241 histone deacetylase 5 3.82 49.00
401714 ENSP00000241802*:CDNA FLJ11007 FIS, CLON 2.02 40.00
401747 Homo sapiens keratin 17 (KRT17) 128.43 68.00
401760 Target Exon 1.74 35.00
401780 NM_005557*:Homo sapiens keratin 16 (foca 26.47 10.50
401781 Target Exon 10J3 4.61
401785 NM_002275*:Homo sapiens keratin 15 (KRT1 4.13 2.70
401797 Target Exon 1.44 ZW
401961 NM_021626:Homo sapiens serine carboxypep 1.41 1.86
401985 AF053004 class I cytokine < on 177 ™
401994 Target Exon
402260 NM_001436*:Homo sapiens fibrillarin (FBL
402265 Target Exon
402297 Target Exon
402408 NM_030920*«
402420 C1(n0823^gl|10432400|and)|CAC102Sai|(A
402674 Target Exon /•'m <<m.iiu
402802 NM_001397:Homo sapiens endothelin conver 1.00 70.00
402994 NM_002463*:Homo sapiens myxovirus (influ 1.37 1.43
403137 NKLOOSSai'i Homo sapiens nucleolin (NO], 1.00 19.00
403306 NIWL00682S transmembrane protein (63kD), endoplasmi 1.00 43.00
403329 Target Exon 1X0 61.00
403381 ENSP00000231844*£cob<iqiiovinr3 Integra 1.00 11&00
403478 NM_022342:Homo sapiens kinesin protein 9 2B.13 136.00
403485 C3001813*:gl|12737279lf^_m2163.1|k 20^3 76.00
403627 Target Exon 6J0 2953
403715 Target Exon 1.30 35.00
404044 ENSP000002378S5*aU398G3.2(NOVaPROTEI 1.00 54.00
404076 NM_016020*:Homo sapiens CGI-75 protein ( 14.29 91XJ0
404101 C8000MO:tf|4235601pirl|A47318RNA4indl 1.00 1.00
404140 NM_006510*:Homo sapiens ret finger protei 1.42 1.44
404165 ENSP00000244S62:NRH dehydrogenase (quino 1.00 54.00
404185 Target Exon 1.00 117.00
404210 NM_005936:Homo sapiens myeloid/lymphoid 5.93 13.77
404253 NM_021058*:Homo sapiens H2B histone famil 1.00 1.00



404287 






C6001909:Sl|704441|dl|BAA1890ai| (D298 


29.71 


4^00 


404298 






CGa0123B*lsi|12171S|s|)|P26697]6rA3_ailCK 


IJO 


1.00 


404347 






Target Emn 

MM-021048:Homo sapims ntelanana anfigea 


1.00 


1.00 


404440 






1.00 


15.00 


404721 






NhL005598*:Homo sedans nuclear tator 1 


1.00 


60.00 


404794 


NVL000078 




chdesteryl ester Iranster pnttein, (das 


1.07 


1.38 


40«S4 






Target Exon 


1.61 


2.01 


404877 






Nl«1.00536&Hotno sapler» melanoina anfigen, 


1.00 


1.00 


404327 






Target Exon 


1.00 


1.00 










\M 


1.00 


4^« 






CY0lN047^|11427234|reip(P 009399.1 |z 






4055^ 






Nll4_Q31413':Hon» sapiens cal'eye syndrome 


m 


78.00 


405572 






Target Exon 


0.76 


1.14 


40K46 






C1%00200:gi|4S57225MNP_OQ0005.1| at 


1.01 


1.28 


405676 


BE335714 




cytochrome c-1 


1.13 


Z89 


405770 






NM_002362:Homo sapiens metanoma antigen. 


45.52 


37.00 


40S932 






C1500030S:g^3806122tgb|AAC69mi| (AFD 


1.99 


1.99 


406137 






NM_000179*:Kamo sa|sens mutS (E. coD] h 


2.77 


Z38 


4063S0 
406399 






Targ^Exon 


ixn 

1X0 


35.00 
39.00 


AI0SK7 






NluL003122^Hann sapiens serine piotsase 
TaigdExon 

InununoglotniBn lambda locus 


1.00 


1.00 


406621 


X57809 


HS.18112S 


1.41 


1.74 


406642 AJ245210 gb:Homo sapiens mRNA for immunoglobulin 2.16 191

406663 U24683 Hs.293441 immunoglobulin heavy constant mu Z07 2.93

406671 AA129547 Hs.285754 met proto-oncogene (hepatocyte growth fa 15.00 51.00

406673 M34998 Hs.198253 major histocompatibility complex, class 0.98 3.09

406676 X58399 Hs.81221 Human L2-9 transcript of unrearranged im 1.30 1.53

406678 U77534 gb:Human clone 1A11 immunoglobulin varia 1J3 1.45

406685 M18728 gb:Human nonspecific crossreacting antig 1.4S 2^

406687 M31126 Hs^2822 pregnancy specific beta-1-glycoprotein 9 aei 8.50

406690 M29540 Hs.220529 carcinoembryonic antigen-related cell ad 226.37 350.00

406698 X03068 Hs.73931 major histocompatibility complex, class 1.01 2.52


406815 AAS33930 Hs.2aB038 tRNA isopentenylpyrophosphate transferas 20.25 Sim

406851 AA6097B4 major histocompatibility complex, class 0.75 1.91

406964 M21305 gb:Human alpha satellite and satellite 3 38.15 1114.00

406967 M24349 gb:Human parathyroid hormone-like protei 1.00 1.00

406974 M5733 gb:Human parathyroid hormone-related pep 1.00 1.00

407103 AA424881 Hsi56301 hypothetical protein MGC13170 1.77 1.10

407128 R83312 Hs.237260 EST 1.00 1.00

407137 T97307 gb7e53M)5.s1 Soares fetal liver spleen 142.70 135.00

407168 R45175 Hs.117183 ESTs 2.16 18.00

407239 AA076350 Hs.67&46 1.10 1.57


407242 M18728 gb:Human nonspecific crossreacting antig 1.12 2.85

407244 M10014 Hs.75431 fibrinogen, gamma polypeptide 3.24 15.38

407289 AA1351S9 Hs.2a3349 Homo sapiens cDNA FLJ12149 fis, clone MA 3.53 3.68

407300 AA102616 Hs.120769 gb:zn43e07.s1 Stratagene HeLa cell s3 93 19.74 73.00

407366 AF026942 Hs.271530 gb:Homo sapiens cig33 mRNA, partial sequ 0.06 8.25

407378 AA2g3264 Hs.57776 ESTs, Moderately similar to I38022 hypot 1.00 28.00

407430 AF1693S1 gb:Homo sapiens protein tyrosine phospha 1.00 25.00

407453 AJ1320B7 gb:Homo sapiens mRNA for axonemal dynein 1.00 ^00

407577 AW131324 Hs.246759 hypothetical protein MGC12538 1.00 1.00

407634 AW016S69 Hs.136414 UDP-GlcNAc:betaGal beta-1,3-N-acetylgluc 111.K 228.00


407710 AW022727 Hs^3616 ESTs 1.00 28.00

407720 AB037776 Hs.38002 KIAA1355 protein 1.89 1.31

407746 mximz hypothetical protein inJIIlOO 1.00 1.00

407756 AA110)21 Hs.3a260 ubiquitin specific protease 18 4J1 S.00

407758 D50915 Hs.33365 KIAA0125 gene product 1.00 2m

407782 AA608956 Hs.11X19 ESTs, Moderately similar to PUiWItilE Ca 0.97 1.14

407788 BE514g82 Hs.38991 S100 calcium-binding protein A2 7.M 3.83

407790 AI027274 Hs.28ra41 Homo sapiens cDNA FLJ14866 fis, clone PL 3.63 42.00

407811 AW190902 Hs.400gB cysteine knot superfamily 1, BMP antagon 89.96 109.00

407839 AA045144 Hs.161S66 ESTs 173.91 108.00

407944 R34008 Hs.239727 desmocollin 2 111.30 70.00

408000 L11690 Hs.620 bullous pemphigoid antigen 1 (230/240kD) 151.17 8.00

408031 AA08139S Hs.42173 Homo sapiens cDNA FLJ10366 fis, clone NT 9.91 93.00




GB086S48 


Hs.42346 


catdneurin-Undlng prolan calsajcin-l 


195.78 


231.00 


.408070 


AW1488S2 




gb3dQ5d05j(1 NCi_CGAP_Brn35 Homo sai^n 


1.00 




408101 AWS68504 Hs.123073 CDC2-related protein kinase 7 37.84 61.00

408122 AI432652 Hs.42824 hypothetical protein FLJ10718 0.85 1.71


408212 


AA37567 


Hs.43728 


hypolheScdprot^ 
inteiieuldnO 


5.88 


7.91 


408243 


Y00787 


Hs.624 


4.27 


9.S8 


408349 BE546947 Hs.44276 homeobox C10 3.79 3.46

408353 BE43983S Hs.44^ mitochondrial ribosomal protein S17 1.88 1.^

408354 AI382603 Hs.1S9235 ESTs 1.00 73.00

408369 R38438 Hs.182575 solute carrier family 15 (H777 transport 1.41 16.50

408380 AF123050 Hs.44532 diubiquitin 15.19 3752

408482 NM_000676 Hs.45743 adenosine A2b receptor

408522 AI541214 Hs.46320 Small proline-rich protein SPRK [human, tg8 l!24

408536 AW381S32 Hs.135188 ESTs 1.55 1.50

408545 AW23540S Hs.2S3690 ESTs l.QO 1.00

408572 AA055611 Hs.226568 ESTs, Moderately similar to ALU4_HUMAN A 1.00 44.00

408633 AWg63372 Hs.46677 PRO2000 protein 107.16 56.00

408660 AA525775 ESTs, Moderately similar to PC4259 ferri 1.00 1.00

408761 AA057264 Hs.238936 ESTs, Weakly similar to (defline not ava 52.24 141.00

408771 AW732S73 Hs.47S84 potassium voltage-gated channel, delayed 3.(S 109.00






WO 02/086443 


40B783 
408790 


AF192522 
AW580227 


Hs.47701 
HS.47B60 


408805 


H69912 


H$>t82S9 


408841 


AW438865 


H8JZ56S62 


408873 


AL046017 


Hs.182278 


408908 
408932 


BE296227 
AAaS9325 


Hs^S0822 
Hs.71642 


40B9S6 


AI979168 


Hs.344(ra6 


40901S 


BE3a9387 


Hs.49767 


409041 
409077 


T97490 
AA4013S9 


K$.5aa02 
Hs.50081 
Hs.190721 


489093 


BE243834 


H5.S0441 


409103 


AFai237 


Hs.1 12208 


409142 


AL136877 


Hs.50758 


409187 


AF1S4830 


HS.S0S66 


409228 


A1654298 


Hs^1695 


409234 


A1879419 


HsJ7206 


409268 


AA625304 


Hs.187579 


409269 


AA576953 


H5.22972 


409361 
409404 
409420 


NMJ005982 
BE2200S3 
Z1S008 


H5£4416 
Ns.129056 
Hs^44S1 


409430 


821945 


Hs.346735 


409446 


A1561173 


Hs.67688 


409506 


NM_006153 


HS.S4589 




AA075382 






AM01369 


Hs.190721 




W74001 


Hs.55279 










AI769160 


Hs!l 08681 




AA1 25985 


Hs.56145 




AW6752S8 


Hs.56265 




NM-001898 
AW502152 


Hs!l23114 




AW247030 






AI337658 


Hs.1 56351 




AVV511413 


Hs]278025 




AW1 03364 






NM_001523 


Hs!57697 


410001 


ABa41036 


Hs.57771 


410032 


BEa8S985 






AB020725 


Hs.58009 


410044 


BES66742 


Hs.58169 


410048 


W76467 


Hs.58218 


410076 


T05387 


Hs.7991 


410102 


AW248508 
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Hs.30736 


KIAA0124 protein


3.16 


1.92 


452838 U65011 Hs.30743 preferentially expressed antigen in mela 174.35 1.00

452862 AA401389 Hs.190721 ESTs 98.26 17.00

452865 AW173720 Hs.34S805 ESTs, Weakly similar to A47582 B-cell gr 1.55 1.00

452934 AAS81322 Hs.4213 hypothetical protein MGC16207 1.73 1.19


452946 


XSS4K 


»K31(»2 


EphAS 
ESTs 


1.00 


1.00 


452976 


R44214 
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401212 
401411 
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401464 
401714 
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401780 
401781 
401785 
401797 
401961 
401985 
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723t»83 Minus 

8516137 Minus 
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172961-173056.173868-173928 



118596-118816.119119-119244t119609-119761.120422-120990.130161-130381.130468.130S93.131097-1312S8.131B66- 

131932. 1324S1-132S75.133580-134011 

83126-832S0,eS32M5S4a94719-95287 

28397-2861 7.28920-29045.291 35-292985941 1-29567,29705-29787.30224-30573 

83215-83435.83531-83656.8374(M3901.84237-fl4393.84955-85037.86290-«6814 

165776-165996.166189-166314,166408-166569.167112-167268.167387-167469.168634-168942 
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42904-43124.43211-43336.44607-44763.451994S281.46337-46732 
121907-122035,122804-122921,124019-124161.124455-124610.125672-126076 
113765-113910.115653-115765.118808-116940 
21059-21168 
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405449 
405568 
405572 
405646 
40567S 
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406467 9795551 Plus 



123S25-1 23713 

30487-31058 

7513-7673 

634480554 

182212-182958 



TABLE 10A: Potential Therapeutic, Diagnostic and Prognostic Targets for The ...

Table 2A shows about 307 genes up-regulated in non-malignant lung disease relative to lung tumors and normal body tissues and/or down-regulated in lung tumors relative to
normal lung and non-malignant lung disease. These genes were selected from about 59680 probesets on the Eos/Affymetrix Hu03 Genechip array.
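
By way of illustration only, the comparison underlying such a selection can be sketched in Python. The sketch assumes that each ratio is the average expression of one sample group divided by the average of normal lung samples; the probeset labels, the intensity values and the cutoff of 5.0 are hypothetical and are not taken from the tables, and the sketch is not asserted to be the procedure actually used.

```python
from statistics import mean


def expression_ratio(group, reference):
    """Average of `group` divided by the average of `reference`."""
    reference_mean = mean(reference)
    if reference_mean == 0:
        raise ValueError("reference average is zero; ratio is undefined")
    return mean(group) / reference_mean


def select_probesets(ratios, cutoff):
    """Return, in sorted order, the keys whose ratio meets the assumed cutoff."""
    return sorted(pkey for pkey, ratio in ratios.items() if ratio >= cutoff)


# Hypothetical intensities keyed by hypothetical probeset labels.
intensities = {
    "probeset_A": {"disease": [120.0, 80.0], "normal": [10.0, 10.0]},
    "probeset_B": {"disease": [9.0, 11.0], "normal": [10.0, 10.0]},
}

# One ratio per probeset: disease-group average over normal-lung average.
ratios = {
    pkey: expression_ratio(values["disease"], values["normal"])
    for pkey, values in intensities.items()
}
selected = select_probesets(ratios, cutoff=5.0)
print(ratios)
print(selected)
```

A ratio near 1.00 indicates similar average expression in both groups; larger values indicate relative up-regulation in the group in the numerator.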

Table 10B shows the accession numbers for those Pkeys lacking UnigeneIDs for Table 10A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.



Table 10C shows the genomic positioning for those Pkeys lacking Unigene IDs and accession numbers in Table 10A. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title

R1: Average of lung tumors (including squamous cell ca

average of normal lung samples



R2: 



407568 AA740964 

408562 Ai436323 

409031 AA376836 

410434 AFD51152 

410467 AF102546 

410808 T40326 

412351 AL135960 

412372 RBS998 

413795 A1J040178 

414154 AW205314 

414214 049958 

414998 NM.002543 

415122 1360708 

415765 NM_00S424 

415775 H00747 

415910 U203S0 



H3.155376 
Hs.62899 
Ks.31141 
H&78728 



Ks.167793 
Ks.73828 
HS.2B5243 
HS.1420Q3 
Hs.323060 
Hs.75819 
Hs.77729 
Hs.22245 
Hs.78824 
Hs.29792 
Hs.78913 



U/dgenaTiOa 

ENSPaOOa024107&TRRAP PROTEIf{. 

Target Exon 

Target Exon

hemoglobin, beta

ESTs

Homo sapiens mRNA for KIAA1568 protein,
ESTs

lcfrGkBieG8ptor2

dachshund (Drosophila) homolog

ESTs

T-cell acute lymphocytic leukemia 1
hypothetical protein FLJ22029
ESTs
ESTs

glycoprotein M6A

oxidised low density lipoprotein (lectin
ESTs

tyrosine kinase with immunoglobulin and
ESTs, Weakly similar to I38022 hypotheti
chemokine (C-X3-C) receptor 1



us and carcinoid tumors) divided by the
is, asthma) divided by the average of normal lung samples
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416319 A1815601 Hs.79197 

416402 NMJDOOZIS Hs.1012 

417^ D13168 Hs.82002 

417421 AL138201 

417S11 AL049176 

418489 U76421 

418726 BE24iei2 

418741 H83265 

41S8B3 BE387036 

419086 NM.0<ia216 

419150 T29618 

419235 A\W470411 

419407 AW410377 

420556 AA278300 

420656 AA27g0ffi 






Hs^120 
Hs.82223 
Hs.e5302 
Hs.87860 
Hs.8881 
Hs.1211 
HS.89S91 



CD83 antigen (activated B lymphocytes, i
complement component 4-binding protein,
endothelin receptor type B
nuclear receptor subfamily 4, group A, m
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421177 AW070211 

422080 R20893 

422426 W79117 

422652 AWS67969 

423099 NM_002837 

424*33 H04607 



Hs.288433 
Hs.41502 
Hs.124292 
Hs.187636 
HS.29082S 
Hs.102415 



Hs.se 



adenosine deaminase, RNA-specific, B1 (h

protein tyrosine phosphatase, non-recept

ESTs, Weakly similar to S41044 chromosom

acid phosphatase 5, tartrate resistant

Kallmann syndrome 1 sequence

TEK tyrosine kinase, endothelial (venous

neurotrimin

hypothetical protein FLJ21276

Homo sapiens cDNA: FLJ23123 fis, clone L

ESTs

ESTs

Homo sapiens mRNA; cDNA DKFZpS86^n121 (f
ESTs, Moderately similar to ALU5_HUMAN A
ESTs



424711 N(yL00S795 

424973 X92521 

425023 AW9568a9 

425664 W006276 

425998 AU076629 

426657 NM.015865 

426753 T89832 

427SS8 D49493 

427983 M17706 

428467 AK002121 

428927 AA441837 

429496 AA453800 

430466 NM-004673 

431385 BE178S36 

431728 NHL007351 

431848 AI378857 

432128 AA127221 

432519 A1221311 

433043 WS7554 

433803 AI823593 

434730 AA644669 

435472 AW972330 

436532 AA721522 

437119 A1379921 

437140 AA312799 

437211 AA382207 

437960 AI659S86 

438202 AW169287 

438373 AI3a2471 

438875 AA827640 

441048 AA913488 

441188 AW292830 

441499 AW298235 

444513 AL120214 

444527 NM.005408 

444561 NA1.004469 

445279 841900 

446017 N98238 



Hs.123641 

HS.921B 

Hs.131987 

Hs.152175 

HS.1540S7 

Hs. 154210 

HS.1S9003 

Hs. 165950 

Hs.171731 

Hs. 170278 

Hs.2171 

Hs.2233 

HS.1B4465 

HS.902S0 

Hs.192793 

Hs.24t519 

Hs.11090 

Hs.268107 

HS.1267S8 

Hs.117037 

Hs.130704 

HS.12S019 
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calcitonin receptor-tike 
matrix metalloproteinase 19
endothelial differentiation, sphingolipi
transient receptor potential channel 6
fibroblast growth factor receptor 4
solute carrier family 14 (urea transport
ESTs

growth differentiation factor 10
colony stimulating factor 3 (granulocyte
hypothetical protein aJ112S9 
ESTs 



Hs.177043 ESTs 



ESTs, Highly similar to AF175283 1 zinc
ESTs

ESTs, Weakly similar to BCHUIA S-100 pro
lymphoid nuclear protein (LAF-4) mRNA
ESTs
ESTs

triggering receptor expressed on myeloid
8baiv54h12r1 Na_CGAP_Ew1 " 



446998 N99013 

447357 AI375S22 

448106 AIB00470 

448253 H25899 

449275 AW4S0a48 

450400 AI694722 

4S0698 AI654223 

450726 Awaweoo 

451497 H83294 

451533 NM_004657 

453636 R67837 

458332 A1000341 

459580 AA022888 
400269 
403421 

407570 Z19002 

412295 AW088826 

414517 M24461 

417204 NB1037 

418307 U70867 

418935 T28499 

421502 AF111856 

421798 N74880 



Hs.5509 

Hs.222194 

HsJ2588 

Hs.124292 

Ks.189059 

Hs. 192102 

Hs255509 

Hs.101689 

Hs.7117 

Hs.11383 

Hs.11392 

HS.2224S 

Hs.55185 

Hs.16714 

Hs.16762 



Hs.171941 
Hs.201591 
Hs.205457 
Hs.279744 
Hs.16026 
Hs.250505 
Hs.284122 
Hs.26530 
HS.169S72 
Hs.220491 
Hs.176065 



Hs.37098 

Hs.117176 

Hs.76305 

Hs.1074 

Hs.83974 

Ks.89485 

HS.10S039 

HS59877 



activator of CREM in testis
ecotropic viral integration site 2B
ESTs
ESTs

Homo sapiens cDNA: FLJ23123 fis, clone L

ESTs

ESTs

glutamate receptor, ionotropic, AMPA 1
small inducible cytokine subfamily A (Cy
c-fos induced growth factor (vascular en
ESTs
ESTs

Rho guanine exchange factor (GEF) 15

Homo sapiens mRNA; cDNA DKFZp564B2062 (f

ESTs

psitaidn

ESTs

hypothetical protein FLJ23191

retinoic acid receptor, alpha

Wnt inhibitory factor-1

serum deprivation response (phosphatidy

ESTs

NM_016369*:Homo sapiens claudin 18 (CLDN
zinc finger protein 145 (Kruppel-like, e
poly(A)-binding protein, nuclear 1
surfactant, pulmonary-associated protein
surfactant, pulmonary-associated protein
solute carrier family 21 (prostaglandin
carbonic anhydrase IV
solute carrier family 34 (sodium phospha
N-acylsphingosine amidohydrolase (acid c



142.00 
147.00 
141.00 



16:36 
167^ 
151.00 
153.00 
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10 



423354 


AB011130 


Hs.127436 


423738 


ABa02134 


HS.13219S 


425211 


M18667 


Hs.1867 


425438 


T62216 


Hs.270840 


426828 


NM_O00a2O 


Hs.172670 


427019 


AA001732 


Hs.173233 


428043 


T92248 


Hs.2240 


430280 


AA351258 


Hs.237868 


431433 


X6501B 


Hs.253495 


431723 


AW058350 


Hs.16762 


432985 


T92363 


Hs.178703 


441835 


AB036432 


Hs.184 


442275 


AW449467 


HS.5479S 


443709 


AI082692 


Hs.134662 


444325 


AW152618 


Hs.16757 


4S09S4 


A19D4740 


Hsi5691 


451558 


NM.001089 


Hs.a630 


453310 


X70697 


H5.553 


456855 


AF035528 


HS.153B63 


444342 


NMJ)14398 


Hs.10a87 


400754 






401045 






401083 






402474 
402808 






403021 
403438 






403687 
403764 






404277 






404288 
404518 


A181S601 




405106 
405381 
408387 






406646 


M33600 




408714 


AI219304 


Hs^66959 


406753 


AAS05665 


Hs.217493 


406973 


M349SS 


Ks.198253 


407246 


U82275 


Ks.94498 


407510 


U98191 




407731 


NlyL000068 


HSJ8069 


407830 


NlyLOOIOBB 


Hs^7 


408045 


AW1389S9 


Hs.245123 


408074 


R20723 




408374 


AW025430 


KS.15S591 


409064 


AA062954 


Hs.141883 




AF050083 


Hs.673 


409153 


W037S4 


HS.S0813 




AA78Q473 


H5£87 


409038 


AL049990 


HS.51S1S 






Hs.301281 




086640 


HS.S6045 




BE178622 


Hs.162gi 


411020 


NM_005770 


Hs.e7726 


411667 


BE160igS 




412000 


AW576555 


HS.157B0 


41235B 


88)47490 


Hs^4172 


412420 


AL03S66B 


HS.73B53 


412S64 


XB3703 


HsJ1432 


412869 


AA2S0712 


Hs.82407 


412870 


N22788 


Hs.82407 


413529 


U11874 




413533 


BE146973 




413689 


eE157288 


Hs^631 


413724 


AA131466 


HsJ!3767 


413800 


AI129238 


Ks.192235 


413802 


AWS844g0 


NSJ2241 


413829 


NMJI101872 


Hs.75572 


414376 


Hs.66915 


414577 


>U0S6S48 


Hs.72116 


414700 


H63202 


HSJ8163 


415078 


AA311223 


Hs.283091 


415120 


N64464 


Hs.34950 


415323 


8E26g352 


Hs349 


41533S 


AA8477S8 


Hs.111030 


41SSB2 


W9244S 


N8.165195 


416030 


H15261 


KS21948 


416427 


BE244050 


Hb.79307 


416464 


NM.000132 


HS.7934S 


416585 


' X54162 


Hs.79386 


416847 


L43821 


Hs^261 


417148 


AA3S9898 


HS.29388S 


417370 


T28651 


Hs.82030 


417673 


T87261 


H5.1635S 



PCT/US02/12476 



ESTs 

activin A receptor type II-like 1
hypothetical protein FLJ10970

interleukin 7 receptor

surfactant, pulmonary-associated protein

Homo sapiens mRNA; cDNA DKFZp564B2062 (f

ESTs

advanced glycosylation end product-speci

ESTs

ESTs

ESTs

receptor (calcitonin) activity modifying
ATP-binding cassette, sub-family A (ABC1
solute carrier family 6 (neurotransmitte
MAD (mothers against decapentaplegic, Dr
similar to lysosome-associated membrane
Target Exon

C11001883*:gi|6753278|ref|NP_033938.1| c
NM_016582*:Homo sapiens peptide transpor
NM_004079:Homo sapiens cathepsin S (CTSS
ENSP00000235229:SEMa
C21000030:gi|9955950|ref|NP_0639S7.1| AT
NM_031419*:Homo sapiens molecule possess
NM_007037:Homo sapiens a disintegrin-l
NM_00S4S3:Homo sapiens heterogeneous nuc
NM_019111*:Homo sapiens major histocompa
NM_002944*:Homo sapiens v-ros avian UR2
CD83 antigen (activated B lymphocytes, i
C11001637*:gi|5032241|ref|NP_005732.1| z
Target Exon
Target Exon

major histocompatibility complex, class
hemoglobin, gamma G
annexin A2

major histocompatibility complex, class



gb:Human trophoblast hypoxia-regulated f
complement component 8, beta polypeptide
arylacetamide deacetylase (esterase)

ilylVB,polypepl
Homo sapiens mRNA; cDNA DKFZp564G112 (fr
Homo sapiens mRNA, chromosome 1 specific
src homology three (SH3) and cysteine ri
gb:PVI3-HT0605-27020(M01-a02 HT0605 Homo
macrophage receptor with collagenous str
gb:QV1-HT0413^02B04l»Jifl3 HHMIS Homo
ATP-binding cassette, sub-family A (ABC1
ESTs

bone morphogenetic protein 2



CXCchemokinelgandie 
CXCchemoklnellgandie 
iilerteilMn 8 receiAv. 

gb:CM441T0222-011 19M1&«06 H110222 Homo 
zinc finger protein, subfamily 1A, 5 (Pe
hypothetical protein FLJ12666
ESTs

ESTs, Weakly similar to SBSSSl alpha-IC-
carboxypeptidase B2 (plasma)
ESTs, Weakly similar to 16.7Kd protein (
hypothetical protein FLJ20992 similar to
ESTs

found in inflammatory zone 3
ESTs

neutrophil cytosolic factor 2 (65kD, chr
ESTs

Homo sapiens cDNA FLJ14237 fis, clone NT
ESTs

Rac/Cdc42 guanine exchange factor (GEF)
coagulation factor VIII, procoagulant co
leiomodin 1 (smooth muscle)
enhancer of filamentation 1 (cas-like do
hypothetical protein FLJ14902
tryptophanyl-tRNA synthetase



67.00 
102.00 
70.00 
112.00 
10.17 



79.00 
27.35 
113.00 



111.00 
95.00 
87.00 



80.00 
85.00 
213.00 
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418067 A1127958 Hs.83393 

418296 C01566 

418643 J03798 

418832 X04011 

418945 BE246762 

419261 X07876 

41SS84 UQ8989 

419574 AK001989 

419968 X04430 

420256 U84722 

420285 AA25ai24 

420577 AA278436 

421262 AA286746 

421445 AA9130Sg 

421470 R27496 

421478 AI6a3243 

421563 KA«_006433 Hs.10580S 

421566 Nk/LOOOSgS Hs.13g5 

421855 F06S04 

421913 AI93436S 

421952 AA300900 






Hs.86671 
HS.B6948 
Hs.88974 
Hs.89499 
Hs.89791 
Hs.91139 
HS.9116S 
Hs.93913 
H5.762Q6 



Hs.1 86643 

Hs.g343 

Hs.104433 

Hs.1378 

HS.972S8 



42238S AF10S374 

423168 fWm 

423196 AKD01866 

423387 AJ012074 

423424 AF150241 

423456 AL110151 

423698 Z92S46 

424027 AW337S75 

424212 NMJ005814 

425087 R62424 

425175 M=Q20202 

425771 BE5617T6 

426486 BE1782a5 

427507 AF240467 

427618 NM_000760 

427732 NM_002980 

427952 M765368 

428709 BE268717 

428769 AWa)7175 

428780 A1478S78 

428833 AIS28355 

4^57 D13626 

430212 AA469153 

430226 eE245552 

430376 AW292053 

430414 AW36566S 

430856 AA482900 

430843 AI734149 



Hs.27384 
Hs.109439 
Hs.98849 
Hs.1 13274 
Hs.115830 
Ks. 124940 
HS.12S139 



Hs^01591 
Hs.143131 
HS.1260S9 



HS.1S9494 

Hs.170056 

HS.1791S2 

Hs^175 

Hs.2199 

Hs.293g41 

Hs.104916 

HS.10S771 

Hs^636 



cystatin E/M
ESTs

small nuclear ribonucleoprotein D1 polyp
cytochrome b-245, beta polypeptide (chro
arachidonate 5-lipoxygenase
wingless-type MMTV integration site fami
solute carrier family 1 (neuronal/epithe
hypothetical protein
interleukin 6 (interferon, beta 2)
cadherin 5, type 2, VE-cadherin (vascula
ESTs, Moderately similar to ZN91_HUMAN Z
ESTs

Homo sapiens cDNA FLJ14265 fis, clone PL
Homo sapiens, clone IMAGE:40S486B, mRNA
annexin A3

ESTs, Moderately similar to S29539 ribos
granulysin

early growth response 2 (Krox-20 (Drosop

ESTs, Moderately similar to ALU4_HUMAN A

osteoglycin (osteoinductive factor, mime

ESTs, Moderately similar to AF161511 1 H

transcription factor EC

heparan sulfate (glucosamine) 3-O-sulfot

GTP-binding protein

hypothetical protein FLJ11004

vasoactive intestinal peptide receptor 1

prostaglandin D2 synthase, hematopoietic

DKFZP5B5D0824 protein

Sushi domain (SCR repeat) containing

ESTs 

I A33 (transmembrane) 



UNC13(C.eleoansH1kB 
Bniton dQaminaglobuSneihl^ . . . . 
Homo saitos mRNA; cONA DKFZp58680220 (f 
toO-Uiia receptor 7 

colony sSmutaOng factor 3 receptor (gr 
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ESTs, Moderately dndar to A539Sg flvorn 
liypolbeBcal protein FU21940 

ESTs 



HS.246S HAAOOOIganepreductputalBve&ptQtd 

glKnc67fD4^1 Na_CGAP_Pi1 Han» s^ns 

Hs.2551 adreneigte,bet»-2-.iB0Bpbr,si"* — 

Hs.i:e32 ehramos *" 

Hs.12a3B8 ESTa 

Ks.162080 ESTs 

Hs.119514 ESTs 



431217 NM_013427 

431921 N46466 

432176 AWD90386 

432203 AA305746 

432231 AA339977 

432485 mms 

432522 011466 

432596 AJ224741 



HS.2S0B30 RhoGTPaseaeBvaGng protein 6 
Hs.58879 
Hs,1 12278 
Hs.49 
Hs.274127 
Hs.276770 
Hs.51 
H&278461 
Hs.3110 



oigioteflsiii receptor 2 




macrophage scavenger receptor 1 
OST 11240 protein 
C0WS2 antigen (CAMPATH-1 ant^en) 
ptMsphaSdyrmo^ glycan, dass A (pa 



433588 A1056672 

434445 AI349306 

435498 AW840171 

435974 U29690 

436061 AI248S84 

437157 BE048860 

437207 T27503 

437311 AA370041 

437439 829796 

438199 AW016531 

439551 W72062 

440515 AJ131245 

440887 AI799488 

441025 AA913880 

441384 AA447849 

441735 Af738675 

442200 AW590572 

442B32 AVW06560 

442957 A194g952 

443282 T47764 

443547 AW271273 

443951 F13272 



WeaMy slnte to transformation-r 
Hs J7744 Homo sapiens bela-1 adrenergic receptor 
Hs.1«I745 Horoos^denscDNA:FU21326fls,doneC 
Hs.12a6S5 ESTs 

HS.1S929 hypattMllcslpRiMiFU12910 



Hs.122147 ESTs 

Hs.11112 ESTs 

Hs.7239 SEC24(&cetBvlsiaB) related gene Earns 

H3.135905 ESTs 

Hs.176379 ESTs 

HS.28K60 Homo saTKBase{MA:nJ22182 lis, done H 

Hs.127346 ESTs 



Hs.49397 ESTs 

H5.132917 ESTs 

Hs.23767 hypotheScal protein FU12666 

Hs.1 11334 feni&(.IIghtpolniepade 

Hs.49265 ESTs 



S9.00 

60.00 

14.74 

3.16 

73.00 

19Z0O 

94.00 

500.00 

1.70 

172J10 

97.00 

64.00 



31.67 
129.(K) 
101.00 
63.60 
148.00 



98.00 

iiaoo 

52.00 
132.00 
15.60 
103.00 



1.84 

128.00 

108.00 

91.00 

87.00 

105.00 

71.00 

115.00 

80.00 

3.10 



lass 

70.00 
197.00 
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444515 AW2M908 

445769 AI7414ri 

445308 R135B0 

446291 BE397753 

446917 AI347863 

447261 NKL006691 
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447482 AB0330Sg 

447997 H006S6 

448299 AA497044 

448782 A10S0295 

450575 NM_005859 

450584 AA04O4O3 



451103 (€2804 

451220 AF124251 

451668 Z43948 

452197 AW023595 

452331 AA598509 

452353 a8825 



453107 NM.016113 



453390 AA862496 

453531 AA417g40 

454741 BE15«96 

456579 AA287827 

456672 AK002016 

457400 AF032906 

457718 F18S72 



H5.1S6672 
Hs.17917 
lto.301957 
Hs.18705 
Hs.29792 
HS.208B7 
Hs,22039 
Hs.29117 
Hs.60371 



ESTs 
ESTs 

Homo sapiens clone 24425 mRNA sequence
interferon, gamma-inducible protein 30
ESTs 



H5.31S70 
HS.259S8 
Hs.26054 
lte.326444 
Hs.232048 
Hs.29117 
Hs.29191 
Ks.30343 
Hs.279746 
Hs.31412 
Hs.28482 



KIAA1233 protein

ESTs, Weakly similar to I38022 hypotheti
hypothetical protein FLJ10392
KIAA0758 protein

purine-rich element binding protein A

ESTs

ESTs

ESTs, Weakly similar to KIAA1324 protein

novel SH2-containing protein 3
cartilage acidic protein 1

ESTs

purine-rich element binding protein A

vanilloid receptor-like protein 1

Homo sapiens cDNA FLJ11422 fis, clone HE

ESTs

ESTs, Weakly similar to JC57g5 COEP prot

glKCM2-Hr0342^)91 29*050405 HT0342 Homo

up-regulated by BCG-CWS

Homo sapiens, clone MGC:16327, mRNA, com

cathepsin Z

ESTs, Weakly similar to ALU4_HUMAN ALU S
)^SC1KA072 nonnallKd MSmt brain cON 



106.00 
47.20 
100.00 



11.33 
94.00 
91.00 
152.00 
86.00 



13Z00 
7Z0O 

em 



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers
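
By way of illustration only, the relationship between a Pkey, its CAT number and the accessions in its cluster can be represented as a simple lookup. The sketch below is in Python; the class and function names are hypothetical, and the two rows are illustrative values patterned on entries of this table rather than a verified transcription of it.

```python
from typing import NamedTuple


class ClusterEntry(NamedTuple):
    """Gene cluster information recorded for one probeset (hypothetical type)."""
    cat_number: str
    accessions: tuple


def build_cluster_index(rows):
    """Map each Pkey to its CAT number and whitespace-separated accessions."""
    index = {}
    for pkey, cat_number, accession_field in rows:
        index[pkey] = ClusterEntry(cat_number, tuple(accession_field.split()))
    return index


# Illustrative rows: (Pkey, CAT number, Accession column).
rows = [
    (430212, "314437_1", "AA469153 AI718503"),
    (454741, "1232559_1", "BE154396 AW817959 BE154393"),
]

index = build_cluster_index(rows)
entry = index[454741]
print(entry.cat_number, entry.accessions)
```

Keying the index on the Pkey mirrors the way the tables cross-reference one another: a probeset lacking a UnigeneID in the expression table is resolved through its Pkey.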



1375344_1 
22779_1 



430212 314437_1 

436532 421602_1

453531 97026_1

454741 1232559_1



R20723 AA263aa3 AA333976 AA334725 AA3341S1 AW965490 AA310513 AI810530 031302 AW134897 AAS30127 AA046953 AI66ffl30 
006094 AW104S34 

BE160ia8AW935898T11S20AW93S930AWffi6073AW861034 ^ _ 

BE148973 BE146972 BE147042 BE147018 BE1467a3 BE147020 BE146781 BE147019 BE146766 8E147021 BE146952 BE146767 BE147044 

BE146797 BE146776 BE146985 BE146793 BE14676a BE146771 BE146954BE146760 BE147048 BE147025 BE147030 

AJ012074 U1 1087 L13288 X75299 L202ffi AW530780 H14880 T28037 AI872991 R72136 AW449839 T81622 T79897 T29519 R94105 T83923 

R73300 AI797007 R73390 AA961010 H74168 AI689932 BE045543 AI808418 AI6C8912 AI806573 AW884084 AW87a7B AW87298S AA56565S 

AI022915 R50847 R73210 H4S098 R46451AW166269ni132AI264S47R52146AI304920 R73391AW884059AVV884085 H7324ira 

T79612 R73145 RS0549 AI094ffi7 AI668793 R72302 AI564366 W01956AA418962 W32571 R72840 H45409 R72085 R46356 R467S8 

AA5Q880S AM1B798T837S1 Rg4072T1&182AA928785AAg03896 

Z92546 AA330586 AIS70SG8 AW341487 AI8270SO AW29B66B AI792ia9 AI01S693 AI733S99 AI5722S1 AI672468 AW193262 A1244716 

A1864375A12Q6100AAB12444AI26g365AI6402S4AW772466A1867336AA627604H16914AA3S8477AA338009 

AA469153 AI718503 AA4892:S

AA721522 AW976443 T93070

AA417940 AA038735 T0702S

BE154396 AW817959 BE154393 



Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22", Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.
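
By way of illustration only, an Nt_position entry can be read as a list of exon coordinates. The Python sketch below assumes comma-separated start-end pairs with inclusive coordinates; the function names are hypothetical, and the example string merely joins two ranges of the kind listed in the table.

```python
def parse_exon_ranges(text):
    """Parse 'start-end,start-end' into a list of (start, end) integer pairs."""
    exons = []
    for part in text.split(","):
        start_text, end_text = part.strip().split("-")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"range start exceeds end: {part!r}")
        exons.append((start, end))
    return exons


def total_length(exons):
    """Sum of exon lengths, assuming both end coordinates are inclusive."""
    return sum(end - start + 1 for start, end in exons)


# Illustrative entry built from two ranges of the kind shown in the table.
nt_position = "84494-84603,91057-91188"
exons = parse_exon_ranges(nt_position)
print(exons, total_length(exons))
```

Entries in which the separators were damaged during text extraction would need to be repaired before parsing; the sketch rejects only ranges whose start exceeds their end.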



402474 7547175 

402808 6456148 

403021 7547270 

403421 9865041 

403438 9719579 

403687 7387384 

403764 7717105 

404277 1834458 

404288 2769644 

404394 3135305 

404518 8151988 

404916 7341826 

405106 8079395 

405257 7329310 



114964-115136,115461-115585.118931-116047.117666-117771.118004-118102 
120799-123966 

126603-1»773.1399a8-140205 



37121-37205.37491-37762.41053-41140.41322-41593.4177341919 

84494-84603 

91057-91188 

80877-81418 

73121-73273 
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TABLE 11A: Genes Distinguishing Adenocarcinoma from Other Lung Disease

Table 11A shows about 84 genes upregulated in lung adenocarcinoma. These genes were selected from about 59680 probesets on the Eos/Affymetrix Hu03 Genechip array.

Table 11B shows the accession numbers for those Pkeys lacking UnigeneIDs for Table 11A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Table 11C shows the genomic positioning for those Pkeys lacking Unigene IDs and accession numbers in Table 11A. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title








average of normal lung samples

Pkey ExAccn UnigeneID Unigene Title














TaigetExon 

NM_003122^Honio sapiens serine protease 


100 


39^00 










canjnoembyonic anttgen^ated ceB ad 










A1827976 




hypoth8&caIpfot^FU13512 










A1AI072003 


Hs.2S(}822 


heparan sulfate [glucosamine} 3-O«ulf0t 
serine/threonine kinase 15 


1.00 










Ks.1 12208 


XAGE-1 protein 










Ar154830 




carbanioy|.phosphale synthetase 1, mHoch 








AAS7G953 




hypothetical praUtrt FU13^ 














ESTs 










AW24S50B 




Homo sapiens cONA I^J14035 lis, dons HE 










6E068889 




synud^, gamma (breast cancer-specific 














cytldina deaminase 




\.ao 






N(lL00u047 




aiylsuK^e E (chondrodysplasia puncta 














amBorlde binding pmtdn 1 (antine oxida 














ESTs, We^dn«artoMU(SJfl]MAN MUON 
























cytodiRime P450, eubbnd^ XXIV (iriMn 










AU076704 




fibrinogen, A dfdia potype{jide 








AW188117 


Ks«303t54 


popeye pratein 3 










AF044197 




small Inducible cytokine B subMy (Cy 












Hs. 102267 


lysyloxklase 












H8.1 02482 


mu^ 5, siditype B, bacheobronchid 














solute carrier famfiy 1 (glutamate bans 












Hs.105352 


GalNAc alpha-2. &«i^trans{eras8 1, 1 










m91 11279 




trefon factor 1 (breast cancer, esboge 




1.00 






AI6SB872 


Hs.zaZoU4 


trimideottde repeat contairnng 9 
h^eScal protdn FLJ22704 










AF073515 




cytddne receptar.Bka fadot 1 














cartaage digomeric malrte protein (pse 














breast caidnomaampElial sequence 1 








M90516 


Hs!l674 


glutamIne-fhictose*i*osphatBlransamin 










AF242388 


HS.149S85 


langsin 










M88700 


Hs.150403 


dopa decarboxylase {aTomaSe L-amtao act 










NM-002497 


Hs.153704 








424960 


BE24538Q 


Hs.153gS2 


5'nudealldase(C07^ 




too 




425523 


AB007948 


HS.1SS244 


KIAA0479 protein 


i.ro 






426230 


AA357019 


Hs.241395 


praleasa, serine, 1 (bypsin 1) 


1.00 


83i0O 
34 JM) 




427701 


AA411101 


Hs.243888 




7.41 




428S8S 


AB007863 


Hs.185140 


K1AA0403 protein 


1.00 


aoo 




4287SB 


AA43398B 


HSJ98902 


hypoBisEcal protein aJ14303 


1.06 


1.13 




429170 


NM.001394 
AA0ig004 


KS.23S9 


dualspecBkil1ypbosphalase4 


16.18 


105.00 




42S263 


HS.19839S 


ATP4Aidliia caEsetle. subteiBy A(ABC1 


1JJ7 


\M 




«9610 


AB024937 


Hs.211092 


lUNX ptolatn: aUNC (palate lung and nas 


1.59 


1.69 




430508 


AI015435 


Hs.104637 


ESTs 


4.75 


7.27 




430985 


AA490232 


Hs.27323 


ESTs, Weddy sMIar to 178885 serine/Ih 


0.94 


1,28 




431548 


AI834273 


Hs.9711 


novd protein 


5.66 


15X)0 




431S66 


AF176012 


HS76072Q 


J doniain conl^ng protstn 1 


49.76 


37.00 




431986 


AA536130 


Hs.1490ia 


Novd human gene mapping to chomosome 20 


1.19 


1.47 




432375 


BES36069 


Hs.2962 


S100 calcium.blnding pnstein P 


1.65 


1.06 




432677 


NM_004482 


Hs.278611 


UDP-Nnacetyl-alpha^alactosandrwpolyp 


1.00 


48.00 
19J>0 




433556 


W56321 


Hs.1 11460 


cddum/catmodulbKlepandent protein kin 


1.00 




433819 


AW511097 


Hs.1 12765 


ESTs 


3.71 


&00 




434001 


AWgS0905 


Hs.3697 


serine (or cysteine) protenase hhibito 


29.31 


72.00 




434424 


AI811202 


Hs.325335 


Homo sapiens dOHAi HJ23623 lis. done L 


1.00 


64.00 




434792 


AA649253 


HS.13245B 


ESTs 


8.52 


44.00 




438217 


TS3925 


Hs.107 


fitatnogen^awl 


57.97 


31.00 




436749 


AA5S4890 


Hs5302 


ledin, gdad(^de4)indlng. sduUe, 4 


1.10 


1.41 




436972 


AA234679 


Hs.25640 


dauifinS 


1.59 


1.46 




437866 


AA156781 




mstaUolhioneIn IE (Tuncilonal} 


3.ez 


101.00 




437935 


AW939591 


Hs.5940 


mudn 13, epiiheBd transmentbiane 


1^ 


1.39 
1.00 




438915 


AA280174 


Ks.285681 


Wniiams-BeuTen syndrome diromosanie 


1.00 
23.28 




439451 


ARffi6270 


Hs.278554 


heterochramatIn4ike piotein 1 


5100 
439753 AimOSS Hs.67709 Homo sapiens mRNA full length insert cDN 1.00 2\M

441031 AI110684 Hs.7645 fibrinogen, B beta polypeptide 1.41 93M

441377 BE218239 Hs.202656 ESTs 22X13 1.00

443614 AV655386 Hs.7645 fibrinogen, B beta polypeptide 1.00 16.00

443813 AA876372 Hs.93961 Homo sapiens mRNA; cDNA DKFZp667D0« (fr 1.20 1.99

443991 NM_002250 Hs.10082 potassium intermediate/small conductance 5.71 6.67

444670 H58373 Hs.332938 hypothetical protein MGC5370 1.98 38.00

444931 AV652066 Hs.75113 general transcription factor IIIA 1.00 54.00

446102 AW168067 Hs.317694 ESTs 1.00 1.00

446163 AA026880 Hs.25252 Homo sapiens cDNA FLJ13603 fis, clone PL 1.00 36.00

446469 BE094848 Hs.15113 homogentisate 1,2-dioxygenase (homogenti 1.00 11.00

447388 AW630534 Hs.76277 Homo sapiens, clone MGC:9381, mRNA, comp 1.24 1.16

447532 AK000614 Hs.18791 hypothetical protein FLJ20607 1.23 1.63

448243 AW369771 Hs.52620 integrin, beta 8 15.84 1.00

448844 AI581519 Hs.177164 ESTs 1.00 31.00

449444 AW818436 Hs.23590 solute carrier family 16 (monocarboxylic 1.00 83.00

451807 W52854 hypothetical protein FLJ23293 similar to 1.55 35.00

452689 F33868 Hs.284176 transferrin 1.54 1.44

453392 U23762 Hs.32964 SRY (sex determining region Y)-box 11 1.00 16.00

453464 AI884911 Hs.32989 receptor (calcitonin) activity modifying 1.55 2.45

453735 AI066629 Hs.125073 ESTs 1.01 1.30



410399 11995 1 BE0688898E068882AF044311 AF017256NI»L003087AP037207AF010126AA633976A/^2836BE298825BE299889A1016464A1684600 

AI936527 AA80467S AA394097 AI139933 AA946606 BE171 31 3 AA722407 AA293803 AM68480 AA056035 AAfl55968 AW796957 AI637713 
AA410737 H49348 AA488472 AA411094AA23SS94AA402624AA443638AW4S2137 AA4217D8AW26S211 AI4g32E6 AA36S132AWS66a44 

419502 18535 1 AU076704T748S4n4a60T72098 T73a65T73a73T69180 T7466BT68786TB0385T73410 T68781T6784ST87593 T73952T67864 T60630 

T68367T«8401T53959T72360 T72099TB0377T589filT71712T72821T64738T74645T72037 T68688 n2063 T73258T728» 
T68220 n4673 T71B00 T683S5T61227 T62738 T69317 T53850 TB4692 T73768 T73962 T733B2 T68914 T70975 T73400 T60631 T73277 
T73203 T70498 TBI 409 T5B925 NM_00050a M64M2 TS8301 T73729 Te944S T60424 T67922 T67736 T68716 T67755 T74765 T73919 T58719 
T74756 T60477 T74863 TB1 109 T683a T58850 T71857 T7342S T53736 T6B607 TS8898 T64S19 T72031 T72079 T64305 T71908 T68107 
T71 916 T73787 TS603S T6442S T7ia70 T60476 T61376 T67820 T71895 T41006 T69441 T68170 T7461 7 T71958 T69440 T61875 R06796 
H48353 Tri914 TS3939 T64121 AA693996 T72525 T67779 T68078 AA01 1465 AA345378 AV654847 AV654272 AV656001 A1084740 T82897 
N33594 AA344542 AW805054 AI207457 T61743 AA026737 H94389 AA382695 AA918409 T68044 S82092T38959 AI017721 AA31239S 
AA312S19T401S6H66239AV6S2g89H38728Rg8S21 AV65S200 FigS790 W032S0 W00913AA344136 AV660126 R97923 AA343596 
AW470774AVB512S6NS4417AA812882AW182929At1fl192H61463 H72060AA344503 H3B639AI277S11AV661108AI20762ST47810 
AA235a52T27BS3T47778 R95746 H70620 AA701463 AW827166 R98475 C20925 AV657287 T71959 T71313 T73920 T73333 T61618 T69293 
T692B3 T73931 T72178 T72456 AV64S639 AV653478 T72957 T72300 T5B9C6 T71 457 T70494 T72956 T70495 T68267 T74407 T85778 
AA344726T27854 T74485T74101 T73868 T71S18T7Z304AA343853 T73909T68070 T72065 H72149 T73493 n3495AV64S993 R02293 
T7047S T647S1 AA344441 AA343657 AA34S732 AA344328 AI1 10639 AA344603 AFQ63513 T&4698 T68S16 T72223 T60507 T67633 R29500 
T72S17 R02292 T60599 T69206 T70452 T74677 R29366 T61277 T74914 T60352 R29675 T74a43 AV645792 AA344408 T69197 T720S7 
TSSaSB T693SB T68258 AV650429 T73341 T61702 T74S98 T40095 K02272 T4010S AA34304S AA341908 AA341907 AA342807 AA341964 
T53747 T72042 T82764A1064899AA3430K)T67832 T72440 T71770T68091 T69108 T72449 T69167 T71289T68251 AV654844 T6437S 
AA34S234 T67S98 AA01 1414 T68036 H48262 AI2075S7 T68219 W86031 T69081 T64232 1^3196 T6213B AV650S39 H67459 T72978 
AA344^ T60362 H58121 T95711 T72803 T68055 T7171S R29036 T72793 T69122 T64595 TB28B8 T69139 T68291 T64652 T67971 T46B62 
AA^S92 A1248S02 R2B454 T647B4 T57001 T73052 T71429 TS1176 TS8866 AVB55414 H90426 AA342489 T73666 T67848 T72512 TS383S 
T67837 T7331 7 T74273 T69420 T68245 T74380 T67B62 T74474 T560K 

421582 2041 1 AI910275X00474X52003X(S030NMJM322SAA314326AA308400AA506787AA314825 AI571948AA50759SAA514579AA587613FB3818 

AA568312 AA614409 AA307S78 A)g2S5S2 AW9501S5 AI910083 IVI1207S BE0740S2 AW004668 AAS78874 AA5B2084 BE0740S3 BE074126 
BED741 40 AAS14776 AA588034 BE074051 BE074068 AWD09769 AW050690 AA858276 R55389 A1001051 AW05070O AW7a)216 AA614539 
BE074045 A1307407 AW602303 BE073575 A1202532 AAS24242 AI970839 AI909751 BE076078 A1909749 R55292 

43^66 44433 2 AA156781 AW293839US2054AA024g63AA778446K073977AW444904AW602S74BE164040BE164012BE163972BE163974BE163S92 

AA83748t AW468444 BEtS5091 AVM6B0OZ AA687333 AAB1 1B30 AA581806 AI8666B6 AI572124 AA043777 AAO409a6 020160 AI53S733 
AAS124a9 AW874142 AI471883 W84421 AA156K0 

451807 8865 1 1Ae2854AL1176aOK20B116BE208432BE208239BEOB2291 AW953423AA351619BE18QB4BBE14056aW60Q80AM^ 

AW4S06S2AW449519AM93S34 A1806539AA3516ia AVM49522AI827628 AA9047B8AA380381 AA886045 AA7744O98E003229 Z41758 



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
TABLE 12A: Genes Distinguishing Squamous Cell Carcinoma from Other Lung Diseases and Normal Lung

Table 12A shows about 72 genes upregulated in squamous cell carcinomas of the lung relative to other lung tumors, non-malignant lung disease, and normal lung. These genes
were selected from about 59680 probesets on the Eos/Affymetrix Hu03 Genechip array.

Table 12B shows the accession numbers for those Pkey's lacking UnigeneID's for table 12A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Table 12C shows the genomic positioning for those Pkey's lacking Unigene ID's and accession numbers in table 12A. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.



Unigene Title: Unigene gene title
R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the average of normal lung samples
R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples



410561
415091
416817
416656
417034
417366
418663
418678
419121



401780 
401781 
401785



AA045144 

L11690 

AIS412t4 



UnigeneID Unigene Title

Hs.2258 matrix metalloproteinase 10 (stromelysin

NM_002425:Homo sapiens matrix metallopro
NM_005557*:Homo sapiens keratin 16 (foca
Target Exon

NM_002275*:Homo sapiens keratin 15 (KRT1
Target Exon

ENSP00000251056*:Plasma membrane calcium
Target Exon
Hs.181566 ESTs

Hs.620 bullous pemphigoid antigen 1 (230/240kD)
Hs.46320 Small proline-rich protein SPRK [human,
Hs.6394 Homo sapiens cDNA: FLJ22044 fis, clone H
Hs.77910 3-hydroxy-3-methylglutaryl-Coenzyme A sy
Hs.78867 protein tyrosine phosphatase, receptor-t




NICE-1

protease inhibitor 3, skin-derived (SKAL
aldo-keto reductase family 1, member B10
heparin-binding growth factor binding pr
hypothetical protein LOC57822
airway trypsin-like protease
tumor protein 63 kDa with strong homolog
serine (or cysteine) proteinase inhibito
small proline-rich protein 3
Homo sapiens cDNA FLJ10570 fis, clone NT
desmoglein 3 (pemphigus vulgaris antigen
odd Oz/ten-m homolog 2 (Drosophila, mous
G antigen 7B

ESTs, Weakly similar to GGCI_HUMAN G ANT
ESTs, Weakly similar to 201 72(BA dihydro
Ksp37 protein

ESTs, Highly similar to S60712 band-6-pr
small proline-rich protein 2A
cyclin-dependent kinase 5, regulatory su



lymphocyte antigen 6 complex, locus D
ESTs

cytochrome P450, subfamily IVF, polypept
interleukin-1 homolog 1
KIAA1313 protein
ESTs
ESTs

hypothetical protein FLJ20093
G protein-coupled receptor 87
ESTs, Weakly similar to AC004858 3 U1 sm
ESTs, Weakly similar to DAP1_HUMAN DEATH



137 J2 
56.19 
33.45 



38.00 
42.00
14.00 
446292 AF081497 Hs.279682 Rh type C glycoprotein 1.55 1.26

447078 AW885727 Hs.9914 ESTs 47.24 24.00

447342 AI199268 Hs.19322 Homo sapiens, Similar to RIKEN cDNA 2010 28.63 1.00

449003 X76342 Hs.389 alcohol dehydrogenase 7 (class IV), mu o 1.00 1.00

449101 AA205847 Hs.23016 G protein-coupled receptor 258 27.00

450832 AW970602 Hs.105421 ESTs 25.17 36.00

452240 AI591147 Hs.61232 ESTs 13.42 1.00

453317 NM_002277 Hs.41696 keratin, hair, acidic, 1 1.19 1.27

453830 AA534296 HsJ0953 ESTs 2452 25.00

454098 W27953 Hs.292911 ESTs, Highly similar to S60712 band-6- 1.26 1.11

455601 AI368680 Hs.816 SRY (sex determining region Y)-box 2 206.11 1.00

TABLE 12B



CAT Number

47065 1 ALl339l6N791t3AF08610lM76721AVW50828AA3640l3AVWS5684Al346341AI867464NS4784AI6SS270AI421279AW^ 
AA77SSS2 N623S1 N592S3 AA626243 AI341407 BE175B39 AA45898B AI3S89ie AA457077 



Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.



Nt_position: Indicates nucleotide positions of predicted exons.

Pkey Ref Strand Nt_position

400566 8118496 Plus 17982-18115,20297-20456

401780 7249190 Minus 28397-28617,28920-29045,29135-29296,29411-29567,29705-29787,30224-30573

401781 7249190 Minus 83215-83435,83531-83656,83740-83901,84237-84393,84955-85037,86290-86814

401785 7249190 Minus 165776-165998,166189-166314,166408-1665^,167112-167268,167387-167469,168634-168942

401994 4153858 Minus 42904-43124,43211-43336,44607-44763,45199-45281,46337-46732

402075 8117407 Plus 121907-122035,122804-122921,124019-124161,124455-124610,125672-126076

404996 6007890 Plus 37999-38145,38652-38998,39727-39872,40557-40674,42351-42450
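
As a reading aid, rows of the genomic positioning tables (Pkey, Ref, Strand, Nt_position) can be parsed mechanically. The sketch below is illustrative only and is not part of the disclosed method. The names `ExonRow` and `parse_exon_row` are hypothetical, and the code assumes that each exon is given as an inclusive `start-end` pair and that pairs are separated by commas.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ExonRow:
    pkey: str                     # Eos probeset identifier
    ref: str                      # Genbank Identifier (GI) of the genomic source
    strand: str                   # "Plus" or "Minus"
    exons: List[Tuple[int, int]]  # assumed inclusive (start, end) positions

    def exon_lengths(self) -> List[int]:
        # With inclusive coordinates, length is end - start + 1.
        return [end - start + 1 for start, end in self.exons]


def parse_exon_row(line: str) -> ExonRow:
    """Parse one 'Pkey Ref Strand Nt_position' row into an ExonRow."""
    pkey, ref, strand, positions = line.split(maxsplit=3)
    exons = []
    for item in positions.replace(" ", "").split(","):
        start, end = item.split("-")
        exons.append((int(start), int(end)))
    return ExonRow(pkey, ref, strand, exons)


# Example row in the same layout as the table rows.
row = parse_exon_row(
    "401780 7249190 Minus "
    "28397-28617,28920-29045,29135-29296,29411-29567,29705-29787,30224-30573"
)
print(row.pkey, row.strand, row.exon_lengths())
```

Rows whose ranges contain characters other than digits, commas, and hyphens will raise a `ValueError`, which is a convenient way to flag entries that need manual checking.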
TABLE 13A: Genes Distinguishing Non-Malignant Lung Disease from Lung Tumors and Normal Lung



PCT/US02/12476 



Table 13B shows the accession numbers for those Pkey's lacking UnigeneID's for table 13A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Table 13C shows the genomic positioning for those Pkey's lacking Unigene ID's and accession numbers in table 13A. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the average of normal lung samples
R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples
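
To make the R1 and R2 definitions concrete, the following minimal sketch computes both ratios for a single probeset. It is an illustration under stated assumptions, not the procedure actually used to build the table: the function name `fold_ratio` and all expression values are hypothetical, and because the text does not describe any normalization, background correction, or flooring of values, none is applied here.

```python
from statistics import mean
from typing import Sequence


def fold_ratio(group: Sequence[float], normal: Sequence[float]) -> float:
    """Average of a sample group divided by the average of normal lung samples."""
    normal_mean = mean(normal)
    if normal_mean == 0:
        raise ValueError("average of normal lung samples is zero")
    return mean(group) / normal_mean


# Hypothetical expression values for one probeset (not taken from the tables).
tumor_samples = [120.0, 80.0, 100.0]   # lung tumor samples
disease_samples = [900.0, 1100.0]      # non-malignant lung disease samples
normal_samples = [10.0, 10.0]          # normal lung samples

r1 = fold_ratio(tumor_samples, normal_samples)
r2 = fold_ratio(disease_samples, normal_samples)
print(f"R1={r1:.2f} R2={r2:.2f}")
```

With these made-up values R2 is much larger than R1, which is the pattern expected of a gene that distinguishes non-malignant lung disease from lung tumors and normal lung.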



Pkey ExAccn UnigeneID Unigene Title R1 R2

408562 AI436323 Hs.31141 Homo sapiens mRNA for KIAA1568 protein, 1.00 230.00
409031 AA376836 Hs.76728 ESTs 1.00 128.00
412372 R65998 Hs.285243 hypothetical protein FLJ22029 1.00 173.00
415910 U20350 Hs.78913 chemokine (C-X3-C) receptor 1 1.00 145.00
417511 AL049176 Hs.82223 chordin-like 1.00 179.00
418819 AA228776 Hs.191721 ESTs 1.00 140.00
422060 R20893 Hs.325823 ESTs, Moderately similar to ALUS_HUMAN A 1.00 156.00
424585 AA464840 Hs.131987 ESTs 1.00 167.00
426753 T83832 Hs.170278 ESTs 1.00 141.00
AA453800 Hs.192793 ESTs 1.00 135.00
430719 AA485988 HsJ93796 ESTs 1.00 133.00
431089 BE041395 ESTs, Weakly similar to unknown protein 23.32 941.00
431385 BE178536 Hs.11090 membrane-spanning 4-domains, subfamily A 1.00 157.00
431728 NM_007351 Hs.268107 multimerin 1.00 157.00
436532 AA721522 gb:nv54h12.r1 NCI_CGAP_Ew1 Homo sapiens 1.00 218.00
437960 AI669535 Hs.222194 ESTs 1.00 147.00
438202 AW169287 Hs.22588 ESTs 1.00 141.00
441499 AW298235 Hs.101689 ESTs 1.00 167.00
444513 AL120214 Hs.7117 glutamate receptor, ionotropic, AMPA 1 1.00 151.00
448253 H25899 Hs.201591 ESTs 1.00 141.00
453636 R67837 Hs.169872 ESTs 1.00 116.00
458332 AI000341 Hs.220491 ESTs 1.00 192.00
«i9587 AA031956 gb:zk15e04.s1 Soares_pregnant_uterus_NbH 1.00 154.00



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



CAT Number Accession

327825_1 BE041395 AA491826 AA621946 AA715980 AA566102
421802_1 AA721522 AW975443 T93070



Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.

Pkey Ref Strand Nt_position

402075 8117407 Plus 121907-122035,122804-122921,124019-124161,124455-124610,125672-126076
Table 14A shows the subcellular localization and preferred utility for the genes appearing in Tables 9A and 10A. mAb symbolizes monoclonal antibody, diag symbolizes
diagnostic, s.m. symbolizes small molecule, and CTL symbolizes cytotoxic lymphocytic ligand. These genes were selected from 59680 probesets on the Eos/Affymetrix Hu03
Genechip array.

Table 14B shows the accession numbers for those Pkey's lacking UnigeneID's for table 14A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

in table 14A. For each predicted exon, we have listed the genomic



Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title




PieLmir. Ptaferredl 
PredXxMX Prcdictod ; 


JMly 

tubceUular local 




Ptey 




UnigenelD 


Unigene TKe I 


4002S9 


X07a20 


Hs.2258 
Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Table 15A shows the Seq ID No, Pkey, ExAccn, UnigeneID, and Unigene Title for all of the sequences in Table 16.

Table 15B shows the accession numbers for those Pkey's lacking UnigeneID's for table 15A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Table 15C shows the genomic positioning for those Pkey's lacking Unigene ID's and accession numbers in table 15A. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.



Seq ID No: Sequence ID number
Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Seq ID No: 1 & 2
Seq ID No: 5 & 6
Seq ID No: 9 & 10
Seq ID No: 13 & 14
Seq ID No: 15 & 16
Seq ID No: 21 & 22
Seq ID No: 25 & 26
Seq ID No: 102 & 103
Seq ID No: 104 & 105
Seq ID No: 127 & 128 414430
Seq ID No: 129 & 130
Seq ID No: 131 & 132
Seq ID No: 133 & 134
Seq ID No: 135 & 136
Seq ID No: 137 & 138
Seq ID No: 139 & 140
Seq ID No: 141 & 142
Seq ID No: 143 & 144
Seq ID No: 145 & 146
Seq ID No: 147 & 148
Seq ID No: 149 & 150
Seq ID No: 151 & 152
Seq ID No: 153 & 154 453884
Seq ID No: 155 & 156 453884
Seq ID No: 157 & 158
Seq ID No: 159 & 160
Seq ID No: 161 & 162
Seq ID No: 163 & 164
Seq ID No: 165 & 166
Seq ID No: 167 & 168
Seq ID No: 169 & 170
Seq ID No: 171 & 172
Seq ID No: 197 & 198
Seq ID No: 199 & 200
Seq ID No: 203 & 204
Seq ID No: 205 & 206
Seq ID No: 207 & 208
Seq ID No: 217 & 218 427335
Seq ID No: 221 & 222
Seq ID No: 225 & 226
Seq ID No: 229 & 230
Seq ID No: 266 & 267
Seq ID No: 272 & 273
frizzled (Drosophila) homolog 6
KIAA0175 gene product
secreted phosphoprotein 1 (osteopontin,
SRY (sex determining region Y)-box 11
bone morphogenetic protein 7 (osteogenic
parathyroid hormone receptor 2
parathyroid hormone receptor 2



PTK7 protein tyrosine kinase 7
ESTs, Weakly similar to JC7328 amino aci
ESTs, Weakly similar to JC7328 amino aci
ESTs, Weakly similar to JC7328 amino aci
ESTs, Weakly similar to JC7328 amino aci




Accession 

M27826 R78416 AA307645 AW957879 AW957800 AA633529 H03662 

AL133916 1^113 AF086101N76721AW950828AA364()13AW9S5684AI346341A18674S4N54784AI655270AI421 279 AW014B82 
AA77SSS2 N6235t NS9253 AA626243 AI341407 8E175639 AA456968 AI3S8918 AA4S7077 

AA00g647AA1312S4AA374293AW9544O5H0441OAW6a62a4AA1S1166BE157467BE157601 H043B4W46291 AW663674 H04021 H0f532 
AA190993 H03231 HS9eOS H01642 AA8S2876AA113758AA626915 AA7«952 AI161014 AA0995S4 R69067 
AW1 18072 /U631982 T15734 AA224195 AI701458 W20198 F26326 AAB90570 N90552 AW071907 AI6713S2 AI37S892 T03517 R88265 
AI124()88AA224388AI(a4316Al354686T33652Al140719AI720211T03490AI372637T15415AV¥205836AA63(J384 T03515T^^ 
AA017131 AA443303 T33623 A1222SS6 T33511 T33785 MiSm 055612 



Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.
Table 16

Seq ID NO: 1 DNA sequence
Nucleic Acid Accession #: NM_001216
Coding sequence: 43..1422

1 11 21 31 41 51 

I I I I I I 

GCCCGTACAC ACCGTGTGCT GGGACACCCC ACAGTCAGCC GC3VTGGCTCC CCTGTGCCCC SO 

AGCCCCTGGC TCCCTCTGTT GATCXXX3G0C CCTGCTCCRG GCCTCRCTGT GCAACTGCTG 120

CTGTCACTGC TGCTTCTGAT GCCTGTCCAT CCCCAGAGGT TGCCCOGGAT GCAGGAGGAT 180 

TCCCCCTTGG GAGGAGGCTC TTCIGGGGAA GATGACCCAC TGGGCX3AGGA GGATCTGCCC 240 

AGTGAAGAGQ ATTCACCCAQ AGAGQAGGAT CCACCCGGAG AGGAGGATCT ACCTGGAGAG 300 

GAGGATCTAC CTGGAGAGGA GGATCTACCT GAAQTTAA6C CTAAATCAGA AGAAGAGGGC 360 

TCCCIGAA6T TAGAGOATCT ACCTACTGTT GAGGCTCCTG GAGATCCTCA AGAACCCCAG 420

AATAATGCXX: ACAGGGACAA AOAAGGGGAT GACCa«3AGTC ATTGGOGCTA TGGAGGCGAC 480 

CCGCCCTG6C CCCGGGTGTC CCCAGCCTGC GOGGGCCXSCT TCCAOTCCCC GGTGGATATC S40 

CGCCCCCAGC TCGCCGCCTT CTGCCCGGCC CTGCGCCCCC TGGAACTCCT GGGCTTCCAQ 600 

CTCCCGCCGC TCCCAGAACT QCS3CCTGCGC AACAATGGCC ACAGTGTGCA ACTC3ACCCT6 660 

CCTCCTGGGC TAGAGATGGC TCTGGGTCCC GGGCGGGAGT ACCGGGCTCT GCAGCTGCAT 720

CTGCACTGGG GGGCTGCAGG TOGTCXaSGGC TOGGAGCACA CTGTGGAAGG CCACC5GTTTC 780 

CCTGCCGAGA TCCACGTGGT TCACCTCAGC ACCGCCTTTG CCAGAGTTGA CGAGGCCTTG 840 

GGGOGCCCGG GAGGCCTGGC CGTGTTGGCC GCXrrTTCTGG AGGAGGGCCC GGAAGAAAAC 900 

AOTGCCTATG AGCAQTTQCT GTCTCGCTTG GAAGAAATCG CTGAGGAAGG CTCAGAGACT 960 

CAGGTCCCAG GACTGGACAT ATCTGCRCTC CTGCCCTCTG ACTTCAfiCCG CTACTTCCAA 1020

TATGAGGOGT CTCTGACTAC ACCGCXXTTOT GCCCAGGGTG TCATCTQGAC TOrGXTTAAC 1080 

CAGACAOTQA TGCTGAQTGC TAAGCAGCTC CACACCCTCT CTGACACCCT GTOGGOACCT 1140 

GGTGACTCTC GGCTACAGCT GAACTTCCGA GOGACGCawSC CTTTOAATGG GCQAOTGATT 1200 

GAGGCCTCCT TCCCTGCTGG AGTGGACAGC AGTCCTCQGG CTGCTGiAGCC ASTC CAGCTG 1260 

AATTCCTGCC TGGCTGCTGG TGACATCCTA GCXCTGOTTT TTGGCCTCCT TmBCTOTO 1320

ACCAGCGTCG CGTTCCTTGT GCAGATGAGA AGGCAGCACA GAAGGGOAAC CAAAS3GGQT 1380 . 

GTGAGCTACC GCCX^GCAGA GGTAGCCGAG ACTGGAGCCT AGAGGCItSGA TCTT GOftSAA 1440 

TGTGAGAAGC CAGCCAGAGG CATCTQAGGG GGAGCOGGTA ACTGTCCTGT CCTGCTCATT 1500 

ATGCCACTTC CTTTTAACTG CCAAGAAATT TTTTAAAATA AATATTTATA AT 

Seq ID NO! 
Protein Ac 

1 11 21 31 41 51 

I I I I I I

MAPLCPSPWIi PI.LIPAPAPG LTVQIXLSLL LLMPVHPQRL PRKQEDSPLG GGSSQEDDPL 
GEEDLPSEED SPREEDPPGE EDLPGEEDIiP GEEDLPEVKP KSEEEGSLKL EDLPTVBAPG 
DPQEPQmiAH HDKEGDDQSH WRYGGDPPWP RVSPACAGRF QSFVDIRPQL AAFCPALRPL 
ELLGPOLPPL PBIJiraUJNGH SVQLT1.PPGI. EMAIiGPGREy RALQLHIflWG AAGRPGSEHT 
VE^aiRFPAEX HWHIiSTAFA RVDEALGRPG GLAVIiAAFLE EGPEEMSAirE QLI.SRI1BEIA
EEGSETQVPG LDISALIiPSD FSRYFQXEGS I.TTPPCAQGV IWTVFHQTVM I£AKQLHTI.S 
0niWGP(3)SR I4UIFRATQP UfGRVIEASF PAGVmSSFIlA AEFVQLHSCL AAGDILALVF 
GUJPAVTSVA FLVQMHRQHR RGTKGGVSYR PAEVAETGA 
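
The "Coding sequence" field of each Table 16 entry gives the span of the open reading frame within the listed DNA sequence, and the protein sequence that follows is its translation. The sketch below shows one way to slice such a span and translate it with the standard genetic code. It is a reading aid under stated assumptions: coordinates are taken to be 1-based and inclusive, the helper names `clean_sequence` and `translate_cds` are hypothetical, and the toy sequence is invented rather than copied from the listing.

```python
import re
from itertools import product

# Standard genetic code, built in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean_sequence(lines):
    """Join listing lines, keeping only A, C, G, T.

    Position rulers, spaces, and running counts are dropped. Any other
    stray character is dropped too, which would shift the reading frame,
    so the input should be checked before the result is trusted.
    """
    return "".join(re.sub(r"[^ACGT]", "", line.upper()) for line in lines)


def translate_cds(dna, start, end):
    """Translate dna[start..end] (assumed 1-based, inclusive) up to the first stop."""
    cds = dna[start - 1:end]
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# Invented 16-base example laid out like a listing line (blocks plus a count).
toy = clean_sequence(["CCATGGCTCC TTGATT 16"])
print(translate_cds(toy, 3, 14))
```

Under the same assumption, a span of 43..1422 covers 1380 bases, that is, 459 codons plus a stop codon.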



Coding sequence: 43B-: 



1 11 21 31 41 51 

I I I I I I

AQCGGGOTTG TCTATTAACT TOTTCAAAAA GTATCAGGAG TTGTCAAGGC AGAOAAGAGA €0 

GTSTTTOCAA AAGGOGGAAA GTAGTTTQCT GCCTCTTTAA GACTAGGACT GAOAGAAAGA 120 

AGAGGAGAGA GAAAGAAAGG GAGAGAAGTT TGAGCCXX3VG GCTTAAGCCT TTCCAAAAAA 180 

TAIVTAATAAC AATCATCGGC GGCGGCAGGA TCGGCCAGAG GAGGAGGGAA GCGCTTTTTT 240 

TGATCCTGAT TCCAGTTTGC CTCTCTCTTT TTTTCCCCCA AATTATTCTT OGCCTGATTT 300

TCCTCGCGGA GCCCTGCGCT CCCGAC3VCCC COSCCCGCCT CCCCTCCTCC TCTCCXCXrCO 360 

CCCGCX3GGCC CCCCAAAGTC CCGGCOSGGC CGAGGGTCGG CGGCCGCCGG OGGGCCGGGC 420 

CCGOSCACAG CX3CCCGC3VTG TACAACATGA TGGAGAOSGA GCTGAAGCCG CaSGGCCXBC 480 

AGCAAACTTC GGGGGGCGGC GGCGGCAACT CCACCGQGGC GGCGGCXX3GC GGCAACCAGA 540 

AAAACAGCCC GGACCGCGTC AAGCXXSCCCA TGAATGCCTT CATGGTGTGG TCCCGOSGGC 600

AGCX3GCGCAA GATGGCX:CAG GAQAACXXXA AQATGCACAA CTCGGAGATC AGCAAGCGCC 660

TGGGCGCCGA GTGGAAACTT TTOTCGOAGA CXWAGAAGCG GCCGTTCATC GACHAGGCTA 720 

ASCaGCIGOQ AGCOOTGCAC ATGAAGGAOC ACCCJOGATTA TAAATACOGG OCCOGGCGGA 780 

AAACCAAGAC GCTCATGAAG AAGGATAAiGT ACACGCTOCC CGGaSaGCTQ CTOGCCCCOG 840 

GCGGCAATAG CATGGOQAGC GQGGTCXJGGG TGGGCGCCOG CCTOGSOaCS GGCOTOAACC 900

AGOGCATGGA CAGTTAOGCG CACATGHAOS GCTOaAaCAA CGGCAGCTAC AGCATGRTQC 960 

AGGACCAGCT GGGCTACCXK CAGCSVCCCQS GCCTCAATGC GCAGSGOSCA GCQCAGATGC 1020 

AGCCCATGCA COSCTACGAC GTGAGCX3CCC TGCAGTACAA CTCC3VTGACC AGCTCGCAGA 1080 

CCTACaWXSAA CGGCTCGCCC ACCTACAGCA TGTCCTACTC GCAGCRGGGC ACCCCTGGCA 1140 

TGGCTCTTGQ CTCCATGGOT TGGGTGGTCA AGTCCGAGGC CAGCTCCAGC CCCCCTGTGG 1200

TTACCTCTTC CTCCCACTCC AGQQCGCCCT GCCAGGCCGG GGACCTCCGG GACATGATCA 1260 

GCATGTATCT CCCCGGOGCC GAGGTGCOGG AACCCXiCCGC CCCCAGCAOA CTTCACATGT 1320 

OCCAGCACTA CCAOAGGGGC CCX3GTGCCCG GCaWWSCCAT TAACGGCACA CTBCtXXrTCT 1380 

CACACATtna AGGGCOaaAC AOCQAAjCTQa AGGaGGGAGA AATTTTCAAA GAAAAAOGAG 1440 

OQAAATGGGA GGGGTGCAAA A6AGGAGAGT AAGAAACAGC ATGGAGAAAA CCXS38TACGC 1500

TCAAAAAAAA AAAAAAAAAA AAAATCCCAT CACCCACAGC AAATGACAGC TGCAAAAGAG 1560 

AACACCAATC CCATCCACAC TCACGCAAAA ACCX3CGATGC CX^CAAGAAA ACrTTTATGA 1620 

GAGAGATCCT GGACrTCTTT TKGGGGGACT ATTTTTGTAC AGAGAAAACC TGGGGAGGGT 1680 

GGGGAGGGOS GGGGAATGGA CCTTGTATAG ATCTGGAGGA AAGAAAGCTA CGAAAAACTT 1740 

TTTAAAAGTT CTAGTGGTAC GGTAGGAGCT TTGCAGGAAG TTTGCAAAAG TCTTTACCAA 1800

TAATATTTA6 AGCTAGTCTC CAAGOGACX3A AAAAAATGTT TTAATATTTG CAAGCAACTT 1860 

TTGTACAQTA TTTATOQAGA TAAACATGGC AATCAAAATO TCCATXGrrT ATAAGCTGAG 1920 
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AATTTGCC3VA TATTTTTCAA GGAGAGGCTT CTTGCTGAAT TTTGATTCTG CAGCTGAAAT 1980 

TTAGGACAGT TGCAAACGTG AAAAGAAGAA AATTATTCAA ATTTGGACAT TTTAATTCTT 2040 

TAAAAATTGT ACAAAAGGAA AAAATTAGAA TAAGTACTGG CQAACX»TCT CTQTGCfrCTT 2100 

GTTTAAAAAG GGCAAAAGTT TTAGACTGTA CTAAATTTTA TAACTTACTG TTAAAAGCAA 2160 

AAATGGCXaiT GCAGGTTGAC ACKGTTaSTA ATTTATAATA GCTTTTGTTC GATCCCAACT 2220 

TTOCATTTTG TTCAGATAAA AAAAACCATG AAATTACTGT GTTTQAAATA TTTTCTTRTG 2280 

GTTTGTAATA TTTCTGTAAA TTTATTGTGA TATTTTAAGG TTTTCCCCCC TTTATTTTCC 2340 

aTAGTTaTAT TTTAAAAGAT TCGGCTCTGT ATTATTTGAA TCAGTCTGCC GAOftATCCAT 2400 

GTATATATTT GAACTAATAT CATCCTTATA ACAGGTACAT TTTCAACTTA AQTTTTTACT 2460 

CCATTATGCA CAGTTTGAGA TAAATAAATT TTTGAAATAT GGACACTQAA AAAAAAAAAA 2520 

AAAAAAACAA AACAAAAAAA CAAAAAACAA AAACA6AAAA AACAAAAAAA AAAACAAAAC 2580 

CACAACACAA AAACAAAAAA AAAAAAAAGA AACAAACACA CAACACAACA CAACACAAAA 2640 
CCACAACACA AACAACAACA CACAGAGGG 



X 11 21 31 41 51 

I I I I I I 

KYNMMETELK PPGPQQTSGG GGCTSTAAAA GGNQKNSPDR VKRPMSRPKV WSROORHKMA 
QENPKMHNSE ISKRliGAEWK LLSETEKRPF IDEAKRLRAL raOCEHPDYiaf HPHRKTKTLM 
KKDKYTIiPGG IiIAPGCWSMA SGVGVGAGI^ AGVNQRMDSY AHMHGWSNQS Y^IMQOQLGY 
PQUPQUJAHG AAQHQPMHRY DVSALQYNSM TSSQTYMNGS FTYSMSYSQQ GTPGMAICSM 
OaWKSEASS SPPWTSSSH SRAPCQACa^I. KOMISHYIiPG AEVPBPAAPS RUIMSQHYOS 



Seq ID NO: 5 DNA sequence
Nucleic Acid Accession #: U91618
Coding sequence: 29-541

1 11 21- 31 41 51 

I I I ! I I 

CXSGACTTGGC TTGTTAGAAG GCTGAAAGAT GATGGCAGGA ATGAAAATCC AGCTTGTATG 
CATGCTACTC CTGGCTTTCA GCTCCTGGAG TCTGTGCTCA GATTCAGAAG AGGAAATGAA 
AGCATTAOAA GCAOATTTCT TGACCAATAT GCATACATCA AAGATTAGTA AAGCACATGT 
TCCCTCTTGG AAGATCSRCTC TGCTAAATGT TTGCAGTCTT GTAAATAATT TGAACAQCCX: 
AGCTGAGGAA AC3VGGAGAAQ TTCaTGAAGA GGAGCTTGTT GCAAGAAGQA AACTTCCTAC 
TGCTTTAGAT GGCTTTAOCT TGOAAGCAAT QTTOACAATA TACCAGCTCC ACAAAATCTG 
TCACAGCAGG GCTTTTCAAC ACTGOGAGTT AATCCAOGMA GATATTCTTG ATACTQOAAA 
TGACAAAAAT GGAAAGGAAG AAGTCATAAA GAGAAAAATT CCTTATATTC TGAAACGGCA 
GCTGTATQAG AATAAACCCA QAAGACCCTA CATACTCAAA AGAGATTCTT ACTATTACIO 
AGAGAATAAA TCATTTATTT ACATGTGATT GTGATTCATC ATCCCTTAAT TAAATATCAA 
ATTATATTTG TGTGAAAATG TGACAAACAC ACTTATCTGT CTCTTCTACA ATTGTGGTTT 
ATTGAATGTG TTTTTCTGCA CTAATAGAAA TTAGACTAAQ TOTTTTCAAA TAAATCTAAA 
TCTTCAAAAA AAAAAAAAAA AAATGGGGCC GCAATT 



MMAOQCIQLV QOihliASSStl SLCSDSEEEM KAI.BADFI1TN MBTSKISKAH VPSVnOmiLtl 60 

VCSI.VNNUIS PAEBTGBVHB BELVARRKCiP TAUJOFSLEA MI.TIYQI.HKI CHSRAFQHWE 120 
ItlQEDILDTG NDKNGKEBVI KRKIPYILKR QLYENKPRRP YZUCRDSYYY 

Seq ID NO: 7 DNA sequence

Nucleic Acid Accession #: NM_006536.2

Coding sequence: 109-2940

1 11 21 31 41 51 

ACCTAAAACC TTCCAAGTTC AGGAAGAAAC CATCTGCATC CATATTGAAA ACCTGACACA 60 

AIOTATGCAQ CAGGCTCAGT GTGAGTGAAC TGQAGGCTTC TCTACAACAT GACCCAAAGG 120 

AGCATTGCAO GTCCTATTTG CAACCTGAAG TTTGTGACTC TCCTGGTTGC CTTAAGTTCA 180 

GAACTCCCAT TCCTGGGAGC TGGAGTACAG CTTCAAGACA ATGGGTATAA TGGATTGCTC 240 

ATTGCAATTA ATCCTCAGGT ACCTGAGAAT CAQAACXTTCA TCTCAAACAT TAAGQAAATG 300 

ATAACTGAAO CTTCATTTTA CCTATTTAAT GCTACCAASA GAAOAGTATT TTTCAGAAAT 360 

ATAAAGATTT TAATACCTGC CACS^TGGAAA GCTAATAATA ACIUSCAAAAT AAAACAAQAA 420 

TCATATGAAA AGGCAAATGT CATAGTQACT GACTGGTATG GGQC&CATGO AGATOATCCA 480 

TACACX:CTAC AATACAGAGG GTGTGGAAAA GAGGGAAAAT ACATTCATTT CA^CCTAAT 540 

TTCCTACTGA ATGATAACTT AACAGCTGGC TACX3GATCAC GAGGC08AGT GT TTOT CCAT 600 

GAATGG6CCC ACCTCXSTTTG GGOTGTGTTC GATGAtSTATA ACAATGACAA ACCTTTCTAC 660 

ATAAATGGGC AAAATCAAAT TAAAGTGACA AGGTGTTCAT CTGACATCAC AGGCATTTTT 720 

GTGIGIGAAA AAGGTCCTTG CCCCCARGAA AACTGTATTA TTAGTAAGCT TTTTAAAGAA 780 

GGATGCACCT TTATCTACAA TAGCACCCAA AATGCAACTQ CATCAATAAT GTTCATGC3iA B40 

AGTTTATCTT CTGTGGTTGA ATTTTGTAAT GCAAGTACCC ACAACCAAGA A6CACCAAAC 900 

CTACAGAACC AGATGTGCAG CCTCAGAAGT GCATGGGATG TAATCACAGA CTCTGCTGAC 960 

TTTCACCACA GCTTTCCCAT GAATGGGACT GAOCTTCCAC CTCCTCCCAC ATTCTGGCTT 1020 

GTACAGGCTG GTGACAAAGT GGTCTGTTTA GTGCTGGATG TCTCCAGCAA GATGGCAGAG 1080 

GCTGACAQAC TCCTTCAACT ACAACAAGCC GCaUjAATTrr ATTTGATGCA GATTGrTGAA 1140 

ATTCATACCT TCGTGGGCAT TGCCAGTTTC GACAGC AAAG GAGAGATCAQ AGCCCSW3CTA 1200 

CACCAAATTA ACAGCAATGA TGATCGAAAG TTGCrGGTTT CATATCTGCC CACCACTGTA 1260 

TCAGCTAAAA CAGACATCAG CATTTOTTCA GGGCTTAAGA AAGOATTTGA GGTGGTTGAA 1320 

AAACTGAATG GAAAAGCTTA TQGCTCTGTG ATSATATTAG TGACCAGCGG AGATGATAAG 1380 

CtTCTTGGCA ATTGCTTACC C»CTGTGCTC AGCAGTGGTT CaU«»ATTCA CTCCRTTGCC 1440 
CTGGGTTCAT CTGCAGCCCC 
TTCTTTGTTC CAGATATATC 
TCTGGRACTG GRGAC3^TTTT 
AAACSrrCACC ATCAATTGAA 
ATOTTTCTAQ TTACGTGGCA 
GGAOSAAAKT ACTACACAAA 






3 GAATTATCAC 
r AGCATGATTG 
r ATTCAGCTTO 



AAACaVCAGTG 
GGCCAQTGGT 
TAATTTTATC 



GCCACTGTGG 
TATGCCAATG 
GAGCCAGAGA 
QTTATAAAAA 
TATAGCTTGA 
CCAOGGAGTC 
GCTCCAAQGA 



CCACCATGCA 
TGGACAGCAC 
AGTAAAAGTC 
AAGCGAAATC 
ACGAATGGAC 



ATAAATATCC 
CATACTAACA 
ATACAGATAA 
CCTTACACTT 
GCAAAG6SAA 
AATAGCCCXA 
TCATTTAGTT 
TTTACATCAA 
CTTGCTATTT 
TTTCACTGTA 
TTTATGACAA 
TTTCTAAQTT 



CCCTGAAAGT 
AAGCCTTTGT 
TGAAACAGGG 
CTGGAGATCC 
ATGATGGAAT 
AAGTGCATGT 
ATGCTATGTA 
AATCAGTAGG 
GCTCCTTTTC 
AAATTATTGA 
CTGGAGAAGA 
TACAGAATAT 
CTCAGCAAGC 
CTGAACATCA 
CAATGGATAG 
TTCCCCCX»A 
CAGCAATGGG 
GCAGGAAAAA 
AAflGTCTCTT 
AAGTCAAATT 
GATTTTTACA 
TGGCTATGAA 
GGGTAAAOTC 
AGCAGAGAAA 
ACTTTQATTA 
GATCATGCTA 
TGTTATATAT 
AGAGGTAACC 
AGGTCTATTG 
TATTGCCTTG 



ATTTTATCCC 
TGTTACGCTG 
TTACTCGAGG 
CAATCACTCT 
TGTACCAGGT 
CAC3AAATGAG 
AGTGCTGGGA 



CCTCCTGAGA 
ACCAATCTAA 
TGGACTTACA 
TCTCGCQCCT 
AGCCTCCATT 



CTTTGATCAG 
CCAAGATGAC 
TGGCATCSW3G 
GCCAAATGGA 
GAACTCCTTA 
TTCTGATCCT 
TTTGATAGGA 



W3ACTCCTTG 
TATTTTTTCT 
CCCAGCATAA 
TACACAGCAA 
GAGGAGCGAA 
GTTCXaiGCTG 
GTAAAAGTAG 
GGCCAGGCTA 
TTTAACAATG 
GAGATATTTA 
GAAACACAT6 
CAGTCTGCTQ 
GTACCTGCCA 
ATCaTTTGCC 



CCTTCTTAGA 
AACATCAAAA 
TGGTAGATCA 



GOACCAiQTOT 



TATAAGACCC 
CTGTATTAAA 
ACAATTCTTT 
AAATTAITCT 
CAAaOAAAOT 
GTCTGCATTA 
TCTCCTTATC 
ATGTAOCCCC 
ACATCTCCCT 
TGGGTATTAC 



GTCTTACAGG 
ATGCTTTCAG 
AAAGTACAGG 
ATACTGTGGG 
TTATATTATT 
CTTTTOGGAC 
CCCTGAACAA 
CCAACTCAGC 
TTCCTCATCC 
CCACTGTCAC 
ATGATGGAGC 
CCTTTGCTGC 
GCACCCCAAC 
AOGGTAATAT 
AGTGGGGCTT 
GCCCCCACCC 
AAGAGGAATT 
CAAQCTATGA 
CTATTTTAGT 
CGTTCTCACC 
AAASCCACAO 
TATCTAACAT 
GAGATTATCT 
TTATTATAGT 
ATGGAACAAA 
ATGGCCTTCG 
ATGCATTGAG 



AGGTTTAAAG 
TAGAATTTCC 
TGAAAATGTC 
CAACGACACT 
TGATCCTGAT 
AGCTAGTCTT 
TACCXaiTCAT 
TGTGCCCCCA 



TTAAAOTAAT 
TTQTTTTATT 
TAACTGTCTG 
TGTGCAGTAC 
TAATGCAAAG 



TGCCACAGTT 
AGGTGCTGAT 
AAATGGTAGA 
CCACTCTATT 
TCAGATGAAT 
TAGCCGAGTC 
TGATQTGTTT 
GACCCTATCT 
AATAAGAATG 
AAATACATCA 
CCAGATTTCC 
AATTTATQTT 
T6CCCAGGGG 
TATATTGAAA 
TGTGACACAT 
ATTATTATAA 
ACTAC3«AAA 
TrTTTGTACA 
ATTAGAAAAC 
GTCTTTAAAG 



ATTTTTCTTT 
TATTTTATAT 
ATTTCAGATG 
TTTAACAATA 
AATTTATTTG 
GGTTATTATG GAATGATAGT TATAGCCCCM 



TGTCAAGCAA 
AGGTTGCTTG 
CTCTTTACCT 
AGAGATCTTT 
TCATACCGGT 
TCAAAGCAGC 
TATA&TGCCT 



1500 

1560 

1620 
1680 



2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 



PCT/US02/12476 



I 

MTQRSIAGPI 
IKEMITEASF 
GDDPYTKIVR 
KPFYIHGQNg 
MEMQSI,SSW 
TFSLVOAGDK 
RAQLHQIIiSN 



YlaFHATKRRV E 



SRISSGTGDI 
PDPIX3RK5fYT 
AVPPATVEAP 
AGADVIKNDG 



WCLVLDVSS 
DDRKIiVSYI. 
PTVLSSGSTI 
FQQHIQIiBST 
NMFITNLTFR 



lYSRYFFSFA 



TGIl 
BAPHLQHQNC 
KMAEADRIJO 

pttvsaktdi 
hsiaj:jGSSAa 

GENVKPKHQIi 
TASLWIPGTA 
PVMIYANVKQ 
AHGRYSLKVH 



I I I 

AGVQLQDNGY NGUiIAIMPQ VPENQNIilSN 
ATWKANNNSK IXQE8YEKAH VIVTDWYGAH 
liTAOVaSRGR VFVREHABLR WGVFDBYHHD 



LILKGVLTAM G 



CTCCCCTCAC 
CCTGGGTGGG 
GAGCTGGCAC 
CAGGGTTTGG 



KNTVTVDNTV 
KPOHWTYTIjN 
GFYFIIiNATV 
VNKSPSISTP 
SVU3VFAOPH 
IQDDFtmAIL 
KNSLQ&AVSN 



QIVEIHTFVG 
EWEKLNGKA 
GGLKFFVPDI 
GMDTMFI.VTW 
NTHHSLQALK 
TATVEPETGD 
AHSIPGSHAM 
PDVPPPCKII 
VNTSKRNPQQ 
lAQAPLFIPP 



MNGTELPPPP 
IASFDSK6EI 
YGSVM11.VTS 



QASGPP&IZIi 
VTVTSSASNS 
PVTLRLLDDG 
YVPGYTANGN 
DLEAVKVEEE 
AGIREIFTP3 
NSDPVPARDY 



GGGTCTGTCT 
CXSCTGGCTGT 
AGCTGAGTAA 
AGAAAGTGGA 
AGCAGGTGGA 
ACTTCTTCCA 
TCTCTTGGGC 
TGTTGKTAAT 
CTGGGAGATC 
TCTCCAAGQC 



TGGGATCAGG 
CCACATATAA 
CTGCCACCTG 
GCTGGTCACT 
GGGGGAAATQ 



21 

I 

3 GATGCX;CAGT 
r GCCCTTGACC 
GGAGGGGGCT 
TTGAGGCAGG 
ATCCTCACCC 
GTCTQCCACA 
ACCTTCCACA 
AAGOAACTTC 



CTTCCAGGAG TATGCTGTTT 



I I 1 

CCCCACGACA CCICCCACTT CCCACTGTGG 
TGQCCmSAG CCCTCCOCCA G 
OGQAOOaAAT aAGTaGOAAT G 
TTTGGTTTCC TTAAA ATGC C AAGTTGGGGG 
TGGGAGCCTG GCTGCCTTGC TCTCCTTCCT 
GATCCATGAT GTCCAGTTCT CTGOAGCAGG 
AGTACTCCTG CCAAGAGGGC GACAAGTTCA 
TGCACAAGGA GCTQCCCAGC TTTGTGGGG6 
TGATGGGCAG CCTGGATOAG AACABTGACC 
TCCTGGCACT CATOUrrGTC ATGTGCAATG 



CCAGGACTGT TGATGCCTTT GAGTTTTQTA TTCAATAAAC TTTTTTTGTC 
ATTTTAATTG CTCAOTGATQ TTCCATAACC CGGCTGGCTC AGCTGGAGTO 
AGGGOCTCCT GGATCCTGCT GCCTTCTGOQ CTCIGACTCT CCiaOAAATC 
CAQAGCTATG CTTTAGGTCT CAATTTIGKsA ATTTCUWCA CCAGCAAAAA 
GAGATAOOTT OCTGACTTTT ATTTTGTCAA ATAAAGATAT TAAAAAAGGC 



AAATACCA 
Seq ID NO: 10 Protein sequence: 



Protein Accession #: NP_005969.1 

1 11 21 31 41 51 

I I I I I I 

MMCSSLEQAL AVLVTTFHKY SCQEGDKFKL SKGEMKELLH KELPSFVGEK VDEEGLKKLM 60 
GSLDENSDQQ VDFQEYAVFL ALITVMCNDF FQGCPDRP 

Seq ID NO: 11 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 336-626 

1 11 21 31 41 51 

I I I I I I 

CTCCCCTCAC CXXGGTCCAG GATGCCCAGT CCCCACGACA CCTCCCACTT CCCACTGTGG 60 

CCTGGQTGGG CTCAGGGGCT GCCCTTGACC TGGCCTAGAG CXXTTCCCCCA GCTGGTGGTG 120 

GAGCTGGCAC TCTCTGGGAG GGAGGGGGCT GGGAGGGAAT GAGTGGGAAT GGCAAGAGGC 180 

CAGGGTTTGG TGGGATCAGG TTGAGGCAGG TTTGGTTTCC TTAAAATGCC AAGTTGGGGG 240 

CCAGTGGGGC CCACATATAA ATCCTCACCC TGGGAGCCTG GCTGCCTTGC TCTCCTTCCT 300 

CGGTCTGTCT CTGCCACCTG GTCTGCCACA CATCCATGAT GTCCAGTTCT CTGGAGCAGG 360 

CGCTGGCTGT GCTGGTCACT ACCTTCCACA AGTACTCCTG CCAAGAGGGC GACAAGTTCA 420 

AGCTGAGTAA GGGGGAAATG AAGGAACTTC TGCACAAGGA GCTGCCCAGC TTTGTGGGGC 480 

ATTCCAGAGA ACCATGTGCT GTGAGGGCCT TCCGAGTCCA TCTGTTTAAT CCTGTCATTG 540 

GAGACTTGAG AAACCAGAGC CCAGAAGGGA AAAGTGATTG TCCCAAGATC ACACAGCACT 600 

GGAGAAAGTG GATGAGGAGG GGCTGAAGAA GCTGATGGGC AGCCTGGATG AGAACAGTGA 660 

CCAGCAGGTG GACTTCCAGG AGTATGCTGT TTTCCTGGCA CTCATCACTG TCATGTGCAA 720 

TGACTTCTTC CAGGGCTGCC CAGACCGACC CTGAAGCAGA ACTCTTGACT TCCTGCCATG 780 

GATCTCTTGG GCCCAGGACT GTTGATGCCT TTGAGTTTTG TATTCAATAA ACTTTTTTTG 840 

TCTGTTGATA ATATTTTAAT TGCTCAGTGA TGTTCCATAA CCCGGCTGGC TCAGCTCGAG 900 

TGCTGGGAGA TGAGGGCCTC CTGGATCCTG CTCCCTTCTG GGCTCTGACT CTCCTGGAAA 960 

TCTCTCCAAG GCCAGAGCTA TGCTTTAGGT CTCAATTTTG GAATTTCAAA CACCAGCAAA 1020 

AAATTGGAAA TCGAGATAGG TTGCTGACTT TTATTTTGTC AAATAAAGAT ATTAAAAAAG 1080 



MHCSSLEQAL AVLVTI 



Seq ID NO: 13 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 58-354 

1 11 21 31 41 51 

I I I I I I 

GTGAGCTCAC CATGTGGGGG TGAGGCTGAG AGAAAACAAG TACACAGCCA CAGATCCATG 
ATGTGCAGTT CTCTGGAGCA GGCGCTGGCT GTGCTGGTCA CTACCTTCCA CAAGTACTCC 
TGCCAAGAGG GCGACAAGTT CAAGCTGAGT AAGGGGGAAA TGAAGGAACT TCTGCACAAG 
GAGCTGCCCA GCTTTGTGGG GGAGAAAGTG GATGAGGAGG GGCTGAAGAA GCTGATGGGC 
AGCCTGGATG AGAACAGTGA CCAGCAGGTG GACTTCCAGG AGTATGCTGT TTTCCTGGCA 
CTCATCACTG TCATGTGCAA TGACTTCTTC CAGGGCTGCC CAGACCGACC CTGAAGCAGA 
ACTCTTGACT TCCTGCCATG GATCTCTTGG GCCCAGGACT GTTGATGCCT TTGAGTTTTG 
TATTCAATAA ACTTTTTTTG TCTGTTGATA ATATTTTAAT TGCTCAGTGA TGTTCCATAA 
CCCGGCTGGC TCAGCTCGAG TGCTGGGAGA TGAGGGCCTC CTGGATCCTG CTCCCTTCTG 
GGCTCTGACT CTCCTGGAAA TCTCTCCAAG GCCAGAGCTA TGCTTTAGGT CTCAATTTTG 
GAATTTCAAA CACCAGCAAA AAATTGGAAA TCGAGATAGG TTGCTGACTT TTATTTTGTC 
AAATAAAGAT ATTAAAAAAG GCAAATACCA 



I I I I I I 

MMCSSLEQAL AVLVTTFHKY SCQEGDKFKL SKGEMKELLH KELPSFVGEK VDEEGLKKLM 
GSLDENSDQQ VDFQEYAVFL ALITVMCNDF FQGCPDRP 

Seq ID NO: 15 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 62-358 

1 11 21 31 41 51 

I I I I I I 

GGAGGGTGTG CGGCTGAGTC ACTGCCTGGG CATCTGGGCC TGGAACCTCG GCCACAGATC 
CATGATGTGC AGTTCTCTGG AGCAGGCGCT GGCTGTGCTG GTCACTACCT TCCACAAGTA 



GGCACTCATC ACTGTCATGT GCAATGACTT CTTCCAGGGC TGCCCAGACC GACCCTGAAG 
CAGAACTCTT GACTTCCTGC CATGGATCTC TTGGGCCCAG GACTGTTGAT GCCTTTGAGT 
TTTGTATTCA ATAAACTTTT TTTGTCTGTT GATAATATTT TAATTGCTCA GTGATGTTCC 
ATAACCCGGC TGGCTCAGCT GGAGTGCTGG GAGATGAGGG CCTCCTGGAT CCTGCTCCCT 
TCTGGGCTCT GACTCTCCTG GAAATCTCTC CAAGGCCAGA GCTATGCTTT AGGTCTCAAT 
TTTGGAATTT CAAACACCAG CAAAAAATTG GAAATCGAGA TAGGTTGCTG ACTTTTATTT 






Protein Accession #: 

1 11 21 31 41 51 

I I I I I I 

MMCSSLEQAL AVLVTTFHKY SCQEGDKFKL SKGEMKELLH KELPSFVGEK VDEEGLKKLM 
GSLDENSDQQ VDFQEYAVFL ALITVMCNDF FQGCPDRP 



Seq ID NO: 17 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 939-2372 

1 11 21 31 41 51 

I I I I I I 

AAGACGGATT CTCXOACAAG GCTTGCAAAT GCCCCQCAGC CATCATTTAA CTGCACCOSC SO 

AGAATAGTTA CGGTTTGTCA CCOGACCCTC CCSQATOQOC TAATTTGTCC CTAGTGAGAC 120 

CCCGAGGCTC TOCCCGOGCC TGGCTTCTTC GTAOCTGGIIT GCATATOSTG CTCCGGGCAG 180 

CGOQGGCGCA GGGCACGCGT TCGCGCACac CCTAGCRCAC ATQAACACGC GCAAGAGCTG 240 

AACCSVAGCAC GGTTTCCATT TCAAAAAGGG AGACAGCCTC TACCGCX3MT QTAOAAGAGA 300 

CTGTGGTGTG AATTAGGGAC OGGGAGGCGT CGAACGGAGG AACGGTTCAT CTTAGAGACT 360 

AATTTTCTGQ AGTTTCTGCC CCTGCTCTGC GTCAGCCCTC ACGTCACTTC GCCflGCAGTA 420 

GCAGAGGCX36 CXX3CGGCGGC TCCCGGAATT GGGTTGOAOC AGQAGCCTCQ CTGGCTGCTT 4B0 

OGCTCGOGCT CTAOSCGCTC AOTCXXXaSGC GGTAGCAGGA GCCTGGACCC AGGCGCCGCX: 540 

GGCGGGOGTG AGGOQCOGOA GCCOGGCCTC GAGGTGCATA CCGGACCCCC ATTCQCATCT 600 

AACaAOGAAT CTGCOCCCCR GAGAGTCCOG GGftGCGCCJGC OGGTCGGTGC COGGCGCGCC 660 

GGOCCATGCA GCGACGGCOG CCBCGQAQCT CC!GAGCAGCO GT»3CGCCCC CCTGTAAAGC 720 

GGTTCGCTAT GCOGGGGCCA CT6TQAACCC TQCCXSCXTTGC OGGAACACTC TTCOCTCOG6 780 

ACCRGCTCAG CCTCTGATAA GCTGGACTCa GCACGCCCGC AACAAGCACC GAGGAOTTAA 840 

GAGAGCCGCA AGCGCAGGOA AGOCCTCCCC GCACGGGTGG GGGAAAGCGG CCGGTGCBGC 900 

GCGGGGACAG GCACTCGGGC TGGO^CTGGC TGCTAGGGAT GTOGTCCTGO ATAAGGTGGC 9S0 

AT6GACCCGC CATGGCGCa3G CTCTGGGGCT TCTGCTGGCT GGTTGTGGGC TTCTGGAGGG 1020 

CCGCTTTCX3C CTGTCCCACG TCCTGCauulT GC3M5TGCCTC TCGGATCTGG TGCAQCGACC 1080 

CTTCTCCTQG CATOGTGGCA TTTCCGAGAT TGGAGCCTAA CRGTGTAGAT CCTGAGAACa. 1140 

TC3U:CQAAA1 TTTCATOGCA AACCRGAAAA GGTTAGAAAT CATCAACGRA GATGKTGTTG 1200 

AAGCTTATCT GG6ACTGAGA AATCTGACAA TTGTGGATTC TGGATTAAAA TTTGTGGCTC 1260 

ATAAAGCKTT TCTGAAAAAC fiGCAACCTGC AGCACATCAA TTTTACCOOA AACAAACTGA 1320 

CGAGTTTGTC TAGGAAACAT TTCCX3TCACC TTaACTTGTC TGAACTGATC CTGGTGGGCA 1380 

ATCCATTTAC ATGCTCCTGT GACATTATGT GGATCAAGAC TCTCCAAOAO GCTAAATGCA 1440 

GTCCAGACAC TCAGGATTTG TACTGCCTGA ATSAAAaCAG CAAGAATATT CCCCTGGOA 1500 

ACCTGCAGAT ACCCAATTGT GQTTTGCCAT CTGCAAATCT GGCCGCACCT AACCTC34CTG 1560 

TGGAQGAAGG AAAGTCTATC ACATTATCCT GTAGTGTGGC AGGTGATOCG GTTCCTAATA 1620 

TGTATTGGGA TGTTGGTAAC CTGGTTTCCA AACATATGAA TGAAACAAGC CACACACAGG 1680 

GCTCCTTAAG OATAACTAAC ATTTCATCOG ATGACAGTGG GAAGCAGATC TCTTGTGTGG 1740 

CGGAAAATCT TGTAGGAGAA GATCAAGATT CTGTCAACCT CACTGTGCAT TTTGCACCAA 1800 

CTATCAC&TT TCTCGAATCT CCAACCTCAG ACCACCACTG GTGCATTCCA TTCACTGTGA 1860 

AAGGCAACCC CAAACXZAGCXS CTTCAGTGGT TCTATAACGG GGCAATATTG AATGAGTCCA 1920 

AATACATCT6 TACTAAAATA CATGTTACCA ATCACACGGA GTACCACGGC TGCCTOC3M3C 1980 

TGGATAATCC CACTCACATG AACAATG6G6 ACTACaCTCT AATAGCCAAG AATGASTATG 2040 

GQAAG6ATGA GAAACAGATT TCTGCTCACT TCATOGOCTQ GCCTGGAATT GACX3ATGGTQ 3100 

CAAACXKSiAA TTATCCTGAT OTAATTTATG AAGATTATCG AACTOCAGCO AATOACATOQ 2160 

GGGACACCAC GAACAOAAGT AATGAAATCC CTTCCACAOA CQTCACTGAT AAAACCGGTC 2220 

GGGAACATCT CTCGGTCTAT GCTGTGGTGG TGATTGCGTC TGTGGTGGGA TTTTGCCTTT 2280 

TGGTAATGCT GTTTCTGCTT AAGTTGGCAA GACACTCCAA GTTTGGCATG AAAGCTTTTG 2340 

TTTTGTTTCA TAAGATCCCA CTGGATGGGT AGCTGAAATA AAGGAAAAGA CAGAGAAAGG 2400 

GGCTGTGGTG CTTGTTGGTr GATGCTGCCA TGTAAGCTGG ACXCCTGGGA CTGCTGTTCO 2460 

CTTATCCCX3Q GAAGIQCTGC TTATCTGGGa TTTTCTGOTA aATGTGGGGS GTGTTIGGAQ 2520 

GCTGTACTAT ATGAAGOCI6 CATATACIGT GAGCTOTaKI TGGGGAACAC CAATOCAGAS 3580 

GTAACrCTCA GGCAGCIAAG CAGCACCTCA AGAAAACAT8 TTAAATTAAX GCTTCTCTTC 3640 

TTACAGTAGT TCAAATACAA AACTGAAATG AAATCCCaiTT GGATTGIACT TCTCTTCTGA 3700 

AAAGTGTGCT TTTTGACXXTT ACTGGACATT TATTGACTTA ATTGCTTCTG TTTATTAAAA 2760 

TTGACCTGCA AAGTTAAAAA AAAATTAAAG TTQAGAACAG GTATAAGTGC ACACTGAATA 2820 

GTCTAATCTA CATGTAACAC ATATTTTAGT GTGATTTTCT ATACTCTAAT CfiCCACTGAA 2880 

TTCAGAGGGT TTGACTTTTT CATCTATAAC ACAGTGACTA AAAGAGTTAA GGGTATATAT 2940 

ACCATCACTT TGGGACTTGG TAGTATTATT AAAAGGTTAT TTCCTTCACT GTCA&TAAAA 3000 

GTCCAAATGT TTAGCTTAOO TCT6AOAGTC AAACAATOTT AAGGATTQTC TTAAAflTTCC 3060 

TTAOOCAGCA AAACAAAACA AAACAAAACA AACAAATGAA AAAOSTTTAA AAAQAAOAAQ 3120 

AAGAAAAAAA AC3AQAACAA GCAGCAACAO CraTTTTBIT GQGGCIATAO ATrTAAQTTA 3180 

GGCATA6TCA ATTTCAGAAT AACTAAGAGT GGAATATATG CATATGGTGA AATTATRACX: 3240 

TTGCCCTTTT TTATTTGCCC TCTGCGATCC ACCTGCTTTT TAGAAGTCTG CCGAGT6AGA 3300 

AGGCCACAGT ATCTCATGCT QTTTGCATTA CRGAACTGC3V GCTTTTCTAC TCTQAAAAGG 3360 

CCTGGGAGCA GAATGGCTGG CCTGCTGTGA GCAGGAGAGQ AGATTCTAAQ AAGQATAGTC 3420 

CCCCCTACAA CATACTGTCA TACTGCTGGG TTTTCATGGG TAGGAAAGCT TGTCX3GACC 3480 

CCAGCAGCAA AGAaGTGQCA QQTCGCTAAT GAATATATGC TTTATAATGT CCTTCTTCAT 3S40 

TGCIGAGAGG GCAGCCTTAG AGCTGTGGAT JTCTGCATCC CCCCTGAaTC TOACCCATGG 3600 

ACACCTGrrr CATTCACTTT AGCATCACAG TGACCTTTGT ATGCTCTGTT C3U3TCTGTCT 3660 

CaSGCAOTAT GCTTerCClO AAGAflAaOTT TCGCTATCCC CACXXXaCCC CACCCCACCC 3720 

TOTTCCrTTT TTATCAGGAG GACTTCAGAG OCAGGCCTGC AGCATTTTGT TTGAAAACAC 3780 

AATCAGCTCT GACAGTTAGA CATGCACACA GAOGCCATAG CTGGATTGGA AACATTGATG 3840 

TTTTAAAAAT TTATTTTTTT TGGAAATAGT TGCACAAATG CTGCSVATTTA GCTTTAAGGT 3900 

TCTATAGATT TTTAACTAGT CCAACACAGT CAGAAACATT GTTTTGAATC CTCTGTAAAC 3960 

CAAGGCATTA ATCTTAATAA ACX:AGGATCC ATTTAGGTAC CACTTGATAT AAAAAGGATA 4020 

TCXATAATGA ATATTTTATA CTGCATCCTT TACATTAGCC ACTAAATACQ TTAT TGCTTG 4080 

AKIAAQACCT TTCACAGAAT CCTATQGATT GCAGCATTTC ACTTGOCTAC TTCATACCX3W 4140 






WO 02/086443 

TGCCTTAAAG AGGGGCAGTT 
TCCTAACTCC ATTTCJAATGT 
TCTGAATTCC CATTTTCTTG 
GATCTTTCCC AAAGGTGTTG 






GAGAATCAQC CMTTGGTAC 
ATAQAAAGGC TATGGATTGT 
AATAAAAAAA AAOGAATATT 
TTTAAAATGQ AGAGAAGTGG 
CTCCTAGGGA ATGATGAAAA 



TTCGCGGCTA 
ATTTACAAAG 
GCTGTGAGCC 
AAAAAAGATT 
TTAAGAACTA 
TGTACCCAAC 
ACAGATAAGG 
CAGCAGGCTA 



AGAAACATGC 
GGCCCCCAAT 
AATCSACAGTT 
AGGCCA6CTA 
AGGCAGGAGC 
TTTAAAGCTT 
TTTTAAAGTG 
AGCTASAAGG 
CCATTTAATA 



CGCCAGTTCT CAAGTTTTCC 
GTGGGGAGGT CCGRACATTT 
TCTGTCATTA CTTAGATTCC 
ATAGCAGAAA TCATQACCCT 
TCAGTATGGC AAAGOTTCTT 
TTATGTTATA CCATGGAGCC 
TTCCAGACCC AAAAAGGAAA 
ATTGCAAGGT AGATTTTTGT 
TATCAAAGAT CAGTTGACAT 



4200 
4260 
4320 
4380 
4440 
4500 



21 



31 



41 






I I I I I I 

MSSWIRWHGP AMARLWGFCW LVVGFWRAAF ACPTSCKCSA SRIWCSDPSP GIVAFPRLEP 

NSVDPENITE IFIANQKRLE IINEDDVEAY VGLRNLTIVD SGLKFVAHKA FLKNSNLQHI 
NFTRNKLTSL SRKHFRHLDL SELILVGNPF TCSCDIMWIK TLQEAKSSPD TQDLYCLNES 
SKNIPLANLQ IPNCGLPSAN LAAPNLTVEE GKSITLSCSV AGDPVPNMYW DVGNLVSKHM 
NETSHTQGSL RITNISSDDS GKQISCVAEN LVGEDQDSVN LTVHFAPTIT FLESPTSDHH 

WCIPFTVKGN PKPALQWFYN GAILNESKYI CTKIHVTNHT EYHGCLQLDN PTHMNNGDYT 
LIAKNEYGKD EKQISAHFMG WPGIDDGANP NYPDVIYEDY GTAANDIGDT TNRSNEIPST 
DVTDKTGREH LSVYAVVVIA SVVGFCLLVM LFLLKLARHS KFGMKGFVLF HKIPLDG 



Seq ID NO: 19 DNA sequence 
Nucleic Acid Accession #: NM_000228 
Coding sequence: 82-3600 



31 



41 



51 



1 I 

CAAGGAAAGG TCCTTTCTGG 
TGTGTTTTGC CCTGCCTGGC 
ATCCACCTGT TGGGGACCTG 
QTGGACTGAC CAAGCCTOAG 
GCAAGTGTGA CTCCAGGCA6 
CATCCTCCGG CCCCRTG03C 



I 1 
GCTTTCAGGC GATCTGGAGA AAGAACGGC3V GAACACACAO 
GGATCACCCC ATTGGCTGAA GATGAGACCA TTCTTCCTCT 
CTCCTGCATG CCCAACAAGC CTGCTCCCGT GGGGCCTGCT 
CTTGTTGGGA GGACCCGGTT TCTCOGAGCT TCATCTACCT 
ACCTACTGCA CCC»GTATGG CXa«3TGGCRG ATGAAATGCT 
CCTCACAACT ACTACAQTCA CCGAGTAOAG AATGTGGCTT 
TGGTGGCAGT CCCAGAATGA TGTGAACCCT GTCTCTCTGC 
TtCCAGCTTC AAGAAGTCAT GAIGaAOTTC CAOGOOCCCA 
GRGCGCTCCT CAGACTTCX30 TAAGACCTQG OaAQTOTACC 
ACCTCCACCT TCCCTCGGGT CCGCCAGGGT CGGCCTCAGA 
CACTCCCTGC CTCAGAGGCC TAATGCACGC CTAAATGGGG 
ATGGATTTAG TGTCTGGGAT TCC3W3CAACT CfAhSXRMi 
ATCACAAACT TGAGAGTCAA TTTCACCAGG CTGGCCCCTG 
CCTCCCAQCG CCTACTATGC TGTGTCCC3M3 CTCCGTCTGC 
GGCCATGCIG ATCOCTGCBC ACCX»A6CCT GGGGCCTCIO CAGGCCCCTC CACOGCTGTG 900 
CaOCCAGCAC AACACTGCOO OCXXAASZTG TGAGCGCPaT 960 



vxxxxxxysa catgctgatt 4B0 

AGTACCTGGC 



ggaaggtcca 
aaattcaaga 
tgccccaaag 
agg6gagcto 



TCCOQACTQC 
TQTTCGQTQC 
ACTTAACCTT 



GGGCTACCAC 



GCCAGCCAGG 
AACTGTGAGC 
GAGACCTGCA 
CCaCTGACCG 



ACAACAACC6 GCCCTGQAGA COQGOSGAGQ 
ACTGCAATGG GCaCTC3VGAG ACATOTCACT 
GGGCATATGG AGGTGTGTGT GACAATTGCX: 
GQTGTCAGCT GCACTATTTC CGGAACCGGC 
TCTCCTGCGA GTGTGATCaS 
GGCAGTGTGT GTGCAAGGA6 CATGTGCAGO 
TCACTGGACT CACCTACGCC AACCCGCAGG 



eCCAGQACGC CCATQAATOC 



CGAAGGCAAG 
TTCCRTTCAQ 
TCCCTGT6AC 
TGACCTATCC 
CTGTQACTQC 



GCTQCCACCG 



AACATCCIGG GGTCCCG6AG GQACATGCCQ TGTGAOGAGG AGAQTGGGCG CTGCCTTTGT 



AGXGOCCAGG GCTOTQAACC 
CAACCAGTTC 
GCAGCCATCC 
TGTGACTGTG 
CTCTGCCGCC 



CAAATGIGAC CAQT6TGCTC 
GTSraCCTGC GACXXX3CACA 
GCCCTGTOGG GAAGGCTTTO 
AGACOJGACC TATGGAGACG 



GGGCTGGAGG 
ATCXXyVGCAG 
GCCATCCrcT 
GAGAC6TTGT 
ACTATQTATC 
GCCTTCCGGA 



CGGCAGGCGG 
ATGTCTTCGT 
ATGGCTTGCA 



AGGGCAGCCG 



GAGATCCAGG 
CAGGACATTG 
CATGCAGTGG 



TGCGCTTTGG 
ACCGTGGCCT 
TTCTCAGCAG 
CCCTCRGGCG 
CCCTTCCGAG 
AGAGGAAGAG 
TGCTGAGCAC 
GCXrrTTTGGA 
GAGGAGGAOG 
TGCCTGACCT 
CCCCAATATC 
GCTGCAGGGG 
AGCAGCTGCX3 
AGGAATCTGC 
GCCQCTCXCA 
ACTTCCTAAC 
TGGCCCTGTG 
CCATTGC3W3C 



CXSGGCCCOGC TGTGACCAGT 
TGCTTCCAGA 
TAGACTCCX3C AATGCCACCG 
GGCCTCCCX3G ATCCTAGATG 
CCCCGCAGTC ACAGAGCAGG 
AACTCTCCAG GGCCTGCAGC 
AGACCTGGAG AGTCTTGACA 
GGAGCAGTTT GAAAAAATAA 
AOCCTACQAG CAGTCAGCCC 



CCTACCACTG GAAGCTGGCC 
ACTCCCCTCA GCCC3VCAGTG 
GTGGCCTGAT GTGCAQCGCT 
TGGCCACAGG ATGCCOAGCX: 
ACAAS6CATC AGOGCGCTGC 



CCAGCCTGTG 
CAAASAGTAA 
AGGTGGCTCA 
TGGATCTGCC 
GAAGCTTCAA 
GCA6TGCTOA 



GTOMSGGCCT 
GATTGAGCAG 
GGTGGCCAGT 
CCTGGAGGAG 



GACACCCACC 
ATGCCCTGGT 
TGTCCTTCCC 
GGGCTTCAAT 
CTCACAGATT 
GATGQAGGAA 



TTCAACAAGC 
GAGCTATOTC 
AGGGCCGGTG 
GCCCAGCTCC 
CAATCCAGTG 
OATGTCAGAC 



TCTQTQGCAA 
CCCAAGACAA 
GGGCCTTCTT 
AGOSQACCAQ 
CECAGCXSCIT 
GCACACOGCT 



CTCCAGGCaVG 
TGGCACAGCC 
GATGGCGGGG 
GCAQATGATT 
GGAGACCCAG 
CXTTAATCCAG 



GCTGCCCACA GACICAGCIA CtGTTCTGCA GAAGATGAKT 



CRGGCroCCC 



AGGGTTGCTG 
AAGC3U3CTGG 
GGGGCAGAGG 



AGGGCCAGGT 
CTCAGQACAC 
AGGTTCAGCA 
GTGACTTCTG 
CAGTCCAQGC 
GATTTGAGAG 



GGAAGATGTG 
CATGCAAGGC 
GGTACTGOSG 
GACACGGATG 
CX3tf3CAGCTT 



AAC6TGQACT 
GCTGAGQCIG 
GTTGGGAACC 
ACCMCCGCT 
CCAGCAGAAA 
GAGGAGCTCC 
GOGGAAGGTG 
AACTATGCTO 



AGGAAGCC3U3 GAGCCQAGCC 
TGCGGCAGGG GACAGTGGCA 

cxxrrrcGGCT tatccaggac 

AGCTGGTGAC AAGCATGACC 
GCCACCAAGC CCGGCAGCAG 
CCAGCGAGCA GGCATTGAGT 
AQTT6AAGGA CCGGTTGGGT 



1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 

1800 
1860 
1920 
1980 



2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 




CRGAGTTCCA TGCTGGGTGA 
GAGCTGTTTG GGGAGACCAT 
CTGCGGGGCA GCCAGGCCAT 
GTGGAGCAGA TCCX3TGACCA 
TQCTACAGCT TCCRGCCCX3T 
GATTGGGTTG GRATGCTTTC 
QACCACCCCT GGTQTQTAGC 
GGGACAGTTA CACTTGACAG 
CTCTCAA.GTC AAGGAAGCTG 
GGAATCCTGG ACCSiAGCACA 
AAAATCTTTG G 






GCAGGGTGCC CGGATCCAGA GTGTGAAGAC AGAGGCAGAG 3430 

GGAGATGATG GACAGGATGA AAGACATGGA GTTGGAGCTG 3480 

CATGCTGCGC TCGGCGGACC TGACAGOACT GGAGAASOGT 3S40 

CATCAATGGG CGCGTGCTCT ACTATGCCAC CTGCAAOTQA 3600 

TGCCCCACTC ATCTG<XGCC TTTGCTTTTG GTTGGGGGCA 3660 

CATCTCCAGG AGACTTTCAT GCAGCCTAAA GTACAGCCTG 3720 

TAGTAAGATT ACCCTGAGCT GCAGCTGAGC CTGAGCCAAT 3780 

ACAAAGATGG TGGAGATTGG CATGCCATTG AAACTAAGAG 3840 

GGCTGGGCAG TATCCCCCGC CTTTAGTTCT OCACTGGGQA 39O0 

AAAACTTAAC AAAAGTGATO TAAAAATGAA AAG<XAAATA 3960 






I 



11 



I 



MRPFFLLCFA IiFGIiUIAQQA 
EWQMKCCKCD SRQPHMYYSH 
MEFQGPMPAG M1.IERSSOPG 
HAKLHGGKVQ UILNDLVSGI 
VSQLRIiQGSC FCHGHADRCA 



51 



HYFRNBRPGA SIQETCISCC 



RTHRWQYIA ADCTSTFPRV 
PATQSQKIQE VGBITNUIVN 
FKPGASA6PS TAVQVHDVCV 
HSETCSmiPA VFAASQGAY6 
CDPDGAVFGA PCDPVTGQCW 
DMPCDEESGR CLCI*I!IWGP 
PCREGFGGLM CSAAAIRQCP 



UlASSTCGLT KPBTYCTQYG 
VKFVSIKJLDIi DRRFQI4EVM 
RCJGRPQSKQD VROQSLPQRP 
FTRLAFVPGfR SraPPSAVYA 
CQHNTAGPNC ERCAPFYNNR 



•cujauausix i.EEEn.si>pR 

AYSQSAQAAQ QVSDSSKUJ} 
TPTENKLCGM SRQMACTPIS 
GENAQLQRTR QMIHAAEESA 
DPnTDAATIQ EVSEAVLALW 
RI.QAEAEEAR SRAKAVEGQV 
VLRPAEKLVT SMTRQIiGDFW 
IKQKYAEIiKD KLGQSSMLGE 



lEQIRAVIiSS 



CKEHVQGERC D1.CKPGFTGI. 
KCDQCS^YHW KLASGQGCEP 
DRTYiarVATG CRACDCDFRG 
CHPCPQTYDA OLRBQALRFG 
PAVTEQBVAO VASAIIiSUtR 



CP6EI.CPQI»I 
SQIQSSAQRL 
LPTDSATVLQ 
EDWGNLRQG 
TRMBEiaiHQA 
QQARIQSVKT 
INGRVIiYYAT 



RLVRQAGGGG GTGSPKLVAL 
GTACGSRCRG VIiPHAGGAFI. 
ETQVSASRSQ MEEDVRRTRIi 
KMNEIQAIAA RLPNVDLVLS 
TVALQEAQDT MQGTSRSLRL 
RQQGAEAVQA QQLAEGASEQ 
EAEELFGETM EMMDRMKDMB 



MAGQVAEQUl 
IiIQQVRDFLT 
QTKQDIARAR 
IQDRVAEVQQ 
AI.SAQEGFER 
IiBIJLRGSQAI 



Seq ID NO: 21 DNA sequence 
Nucleic Acid Accession #: 
Coding sequence: 145- 



1 

TCGTTGATAT 
ACAGTACTGC 
AAAGAAAGTT 
CCAGAGGTTT 
ATTGACTTGA 
AGCATGGACT 
ACGAACCTGG 
AGTCCCTATA 
CCC»GCrCCA 
CCAGGCCCGC 
TGQACGTATT 
CAGATCAAGG 
AAAAAAGCTG 
GAATTCAACO 
CATGCCCAGT 



41 



CAAAGACAGT 
CCTGACCCTT 
ATTACCGATC 
TCCAGCATAT 



ACTTTGTGGA 
GXATCCGCaX 
GGCTCCTGAA 
ACACXGACCA 
CCTTCGATGC 
ACAGTTTCGA 
CCACTQAACT 
TGATGACCCC 
AGCACGTCaC 
AGGGACA6AT 
AT6TAGAAGA 
TTGGCACTQR 



TGAACCATCA 
GCAGGACTCG 
CAGCATGOAC 
CGCGCAGAAC 
TCTCTCTCCA 
OGTGTCCTTC 
GAAGAAACTC 



6GAGGTGGTG 
TGCCCCTCCT 
TCCCATCACR 



GGGCAAGTCX: TGGGCCGACG C 
AGQAAGGCGG ATGAAGAXAG 



GATGGTACGA 
AAACQAAOAT 
GAAATGCTGT 
ATTGAAACGT 



CCCCAGATGA 
T6AAQATCAA 
ACAGOCAACA 
GCTTGAGGAA 



TCTATATTTT 



CCCAACTGCr 
TTACAAGAAA 
GAACCACTGT 
GAAAGGGGCA 
AATTCACAGG 
AA AAAAG TTG 
GCCTTTTAAT 



AAGTQTQTGT 
TGIOTATCTA 
CAAftGGCACA 
GGATGTTTTC 



TTAAGATGTT 
GAAGCTTTTG 
TTArrGTCTQ 
OCTSGTCATG 



GCTGTGTACC 
CATQAAACCC 
CTCATTTTGT 
TGTTTACCAT 
AATTTGCTTA 
CTGATACTGT 
AGAOGTGITA 



TGGRAGACCT 
GCTTTTAATA 
TATTCAAAGC 
ATTAGAGCTT 
TCAGTGCATT 
AAATCAGCAC 



GAATTTTGAA ACTTCACGGT GTGCCACCCT £0 
TTTCGTAGAA ACCX3U3CTCA TTTCTCTTGG 120 
CAGAGCACAC AGACAAATGA ATTCCICAGT 180 
CTG6AACAGC CTATATGTTC AGTTCAGCCC 240 
GAAGATGGTG CGACAAACAA GATTGAGATT 300 
GACCTGAGTG ACCCCATGTG GCCACAQTAC 360 
CAGCAGATTC AGAACGGCTC CTOSTCCACC 420 
AGCGTCACGS CSCCCTCGOC CTACGCACAS 480 
TCACCCGCCA TCOCCTCCAA CACOGACTAC S40 
CAGCAGTCGA GCACGGCCAA GTCGOCC^CC 600 
TACTGCCAAA TTGCAAAGAC ATGCCCCATC 660 
GGAGCTGTTA TCCGCGCCAT GCCTCTCTAC 720 
AAGCXajTGCC CCAACCATGA GCTGAGCCGT 780 
AGTCATTTGA TTOSAOTAGA QGGGAACAGC 840 
GGAAGACSIGA aTGTOCTGGT ACCTTATGAa 900 
ATTCAOJACA GTCTTQTACA ATTrCATGTG TAACAGCAGT 960 

1020 

1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 

1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 

CTATCCCTCA AGCCTACCTA CX3VTAAAACC AGCCATATTA 2460 
2520 

TCCTGGACTG GAAATTAAAG ATTGAAAGGG TAKSVCTACFT 2580 



CATCAGAAAS 
TCGTCASAAC 
TGAACTOTTA 
AGAGTCCCTG 
GCAACAGCAQ 
TGAGCTTGTG 



6TTGTATTTC 
GCCCTCATAA 
AAGCC3VCFA6 
TGCAGATTTT 
GAGCTTTCTG 
TATTGGAACC 
AGCAGGTCTC 
TGCATAAGTA 
TAATAATATT 
ATC3VrrACCA 
TTGTGTCCTC 
ACTOTATGTT 
ACTACAAAAA 
GAAAGACAAA 



6CCCGQATCT OTGCTTGCCC 
CAGCAASTTT OSGACftOTAC 
ACACATGGTA TCCAGATGAC 
TACTTACCAQ TGAGGGGCXB 
QAACTCATGC AGTACCTTCC 
CaGCaCCAGC ACTTACTTCA 
6AGCCCCGGA GAGAAACTCC 
AACCQATCAG TGTACCCATA 



AOGAAGASAC 
AAAGAAOGGT. 
ATCXaTCAAG 
TGAGACXTAT 
TCAGCACACA 
GAAACATCTC 
AAAACAATCT 



TGAGAGAATC 
GTATCCTTAG 
TTGTTTCCTG 
CTTTTCTGTC 
AAACTTAAGA 
AGTTGTAGGT 
GCAAGTA6TA 



ACCGGCCATT 
GGAGGGRGGG 
TTCTTCTGTT 
TGTCTTTTTA 
QACTGAGAGA 



ACTCAAACCT 
GGTGGGTGAG 
GTCAGGTGGG 
GTTTTTCTAA 
AGAAAAGGA6 
CTCAGTCAGA 
GTGTCAAGTG 
TGGAGAGTTC 
ATTTCTTAAT 



AACTQTTGTT 
TCCACCCCAG 
ATTTGAAGCC 
AGCCTACCTA 
ACTTACGTTT 
GAAATTAAAG 



CTTACGTAGT 
ATCTGTGATT 
AGCCATATTA 
T TC Wrrri ' T TACTCAAAAG TTTAOAQMIT CTCTGTTTCT TTCCRTTTTA AAAACATATT 2640 
TTAAGATAAT AGCATAAAGA CTTTAAAAAT GTTCCICOCC TOCATCTTCC CftCACCCAGT J700 
C»CCAGCACT GTATTTTCTG TCACCAAGAC AATGATTTCT TGrTATTQAG GCTQTTGCTT 27S0 
TTGTGGATQT GTGATTTTAA TTTTCAATAA ACTTTTGCAT CTTGGTTTAA AAOAAA 









11 



31 



SI 



I I I I I I 

MSQSTQTHEF LSPEVFQHIM DFLBQPICSV QPIDLHFVDE PSEDGATMKI EI8HDCIRMQ 
DSDLSDFMWP QTimJBIAMS MDQQIQtlGSS STSFmTOSA C^ISVTAPSPY AQPSSTFDAIi 
SPSPMPSHT DYPGPHSroV SPQQSSTAKS ATWTYSTEUC KI.YOQIAKTC PIQIKVMTPP 
PQCSlVIRAMP VYKKAEHVTE WKRCPNHEL SRBFHEGQIA PPSHLIRVEG KSHAQYVEDP 
ITGRQSVLVP YBPPQVGTEF TTVIiYNFMCN SSCVGQMMRR PILIIVTLBT BDGQVLGKRC 
FEARICACPG RDRKADEDSI RKQQVSDSTK KGOGTKRPFR QNIHaiQMIS IKKRRSK>OB 
LLYIiPVItGRB TYEMMiKIKE SLELMQYLPQ HTIBXraQQQ QQQHQBLIiQK HLLSACFRNB 
LVEPRHBTPK QSDVFFRHSK PPNRSVYP 



Seq ID NO: 23 DNA sequence 
Nucleic Acid Accession #: NM_001944.1 
Coding sequence: 84-3083 



: ASACXjaCTGQ c 



TTTTCTTAGA Ol 
mCACCAGG C» 

CCATCTT06T OSTGGTCKrA TEGOTTCATG QAGAATTGCS AATAGAiQACT AAAGSTCAAT 180 
ATQATOAAGA AaAOAXaACT ATGCSVACAAG CTAAAAQAAO QOVAAAAOGT SAATOGQTGA 240 



AATTTGCC3VA 
TTACTTCAOA 
ATCAGCCGCC 
CTATAGTCGA 
AAGGACTAGA 
ATCCXCCRGT 
ACTCACTG6T 
AAATTGCCTT 



ATTTTCACAA 
GATOATACTA 
CAAAATTGTC 



GAAS6AGAAG 
ACCCAGAAAA 
TTTGTTGTTG 
ACTtXAAGCT 
CCACTTATAC 
CAAATTTTCA 



TCACCTACOS 
ACAAAAACAC 
TCCTGATCAC 
TAACGGTTAA 
TGGGTGAAAT 



ATOGTCTGGT 
GTAATATTAA 

CAGCRCGTAT 
TGGATGAAC3A 



CTGAATTTCA 
AGGTAATftAA 
AAAAAGGCAT 
ATGAGGACAC 
GATACCTAAT 
ATTCTACTTT 



AOTGAAAOAT 
TGAAGAAAAT 
GTACACAGAT 
TGAAATACAA 
TTATGAACAA 
CCAATCAGTT 



ACTTTGACCA 
GCAQACAAAO 
OTCAAOJATA 
ATTTTAAGTT 
AATTGGCTTG 
ACTCATCCTA 
CTACAAAGCG 



AASaAACCCA ATTGCCAASA 
AATCrcTGGA GTGGGAATCG 
TGGAGATATT AACATAACAG 
ATGTCGGGCT CTAAATGCCC 
AATTTTGGAT ATTAATGATA 
TGAAGAAAAT AGT6CCTCAA 
ACXaA ACCA C TTGAATTCTA 
ACCCATGTTC CTOCTAAGCA 



CAACAGCTGT 
CTAOAACACT 
TAAAGTTGCC 
CCCAGGAACA 
ACAATCX3GTG 
GCATCTGTGG 



CTQGAGTTGQ 
GAGGAA OCAA 
TTTCTCAQAA 
TGTTGATCTA 



AAGTAQCAAA 
TAACAAAQCT 
GATTGATTCA 
CATAGTTAAC 
TTCTACAGGC 
CCTtXSAAAAA 
GAATAATAGA 
TGCOSTATGG 
QATACCTCCT 
TGAGATGCCA 
AACTTCTTAC 
GGGGCCTGCC 

CCCAGTTCCT 
TGAAGACAAG 
CATGGAAAGT 
TTCAGQAATG 
CTTTGCAACA 
CATCTGTTCC 



TTAAAAAACr 
CCTCTAAAQA 
CAGGATTTGT 
CTGGGTCTGT 
TAAOGGAGAC 
CACTTCTGAC 
CTOGCAACCT 
ATCCTTGCTC 
ATCTTTGGAC 



AGCATTTGCC 
TGAXAATGAA 
TATTQCIGAT 



GCCTCAAATQ 
AAAACT6CTG 
AAAACAATCA 
ACGGTATATG 
GATGCAOm 
TACACTGQCC 
AGTATCACAA 
GGAGTATACC 
CXSCAGCTTGA 
CCAACCACAA 
GCCATOGGCC 
ACCTGTGACT 
GATGGCTCAG 
GAAATCACAA 
TCTGAAGTTT 
GAAATGACCA 
OGGACAGTGX 
TCAGGGC3U3T 
GCTGATGGGG 



ACTTOCCSU^T 
CTGAATTACT 
CAGTATATTT 
GAACTAAT6A 
TGAAACTTAQ 
ACCGAGTTCA 
TCOGTCCTGC 
ATTATATCCT 
TCAAATATGT 
JVAATC3UUVTT 



ACTATCAACT 
GTTTAGASAC 
TGGATTTCAA 
CTTTACCTCT 
AGGCATCCTG 
TATTGCTGTC 
GTCAACXX:CA 
TTCCAAGACA 
GGGAACATAT 
CATGGGAC6T 
TGTCAAAAAT 



CAATGT6AAT 840 
TCTCAGTATT 900 



AAAQTGGTGA 
AAAAACAAAG 
GTCACAATTC 
TTTACTGTGC 
CAAGCCATCG 



TTAQAOTACC 
GCAGTTCTTC 
CCTATACATT 
CCCTCAATGC 
ACATCTCCCT 
CACTGGAAGT 
GCCCTGGGAC 



ACCTTCOGTG GTTGTCTOCG 
GATCAACCTG 
CTCCTCAGAG 
GACauSTCAGA 
GACAACAGGG 
AGGCCQCACT 
CIGCTGCTGT 



TACCTOSGCC 
GGTACTTACA 
CTGTCAGTGT 
CAGGTATGGC 
TGGTCTCCTG 



AAOG AACAA T TGATC BCTFOG GGAAnGAAG 
ATATTTOTOT GCCTCCIGTA ACAGCCAATG 
GTACAAATAC GCAIQOCASA 66CACA0OGO 
CIAAGCTTGG AGCAGCCACT GAATCTGGAG 
CAGGMGCTGC TTC3U3GATTC GGAGCAGCCA 
CTGGAACCAT GAGAACAAGa CATTCCACTQ 
GGATAAGCAT GAATTTTCTG GACTCCTACr 



CAGOGGTTAT 
TAAGTQCCAG 
CCAGCCAGCT 
TTACTCGGCT 
ACAAAATOTG 



TACCCCAAAA C 



ACTTTGTCAO 
GTTTCCATCC 
TCTGGTTCCC 
ATAGTGACAO 
ACGCAGCTAC 
TCACCAGAAX 
AAAATAQOVT 
CATAAACTGA 
CTCACTCCTA 
CTAAAATCAT 



OCTGTGGCCA 
GAAGTCAAGG 
CTGACCCTCT 
TCGTGC3VACC 
AAACGGTGAT 



GAGCIGQAAT 



TCCCaTAGAA 
AGCTTCTGCT 
GCSiOCATCGT 
TTCCACTGCA 
CTGTCCCftTT 
TACTATGCTC 
ACCACACIGA 



TTGTCOGCCT 
AACTATTTAG 
GGCTTTGATC 
TCCAGTGTTC 
TGTACA6AG0 
CCAAATCTGG 
GCTAATAATT 



TCAOGATTAT J! 
ATTCTOUVGT Jl 
ATTCBC 



\ TTGTAOTAAA 



1200 
1260 
1320 
1380 

1500 
1560 
1620 
1680 
1740 



1920 
1980 
2040 



2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
I I I 

MMGLFPRTTG ALAIFWVIL 
QEDNSKRNPI AKITSDYQAT 
PSFLITCSAL KAQGLDVEKP 
ATDADEPHHL NSKIAFKIVS 
DKDGBGLSTQ CECKIKVKDV 
WIAVYFFTSG NEGNWFBIQT 
SRTOVQOTPV TIQVIKVSBG 






WVRVPDFND NCPTAVLEKD 
ITTLWATSAL LRAQEQIPPG 
TTSPGTRYGR FKSGRLGPAA 



I 

VHGEa.RIETK 
QKITYRISGV 
LILTVKIIiDI 
QEPAGTPMFl. 
NDNFPMFRDS 
DPRTNEGILK 
lAFRPASKTF 
TAEIKFVKNM 
AVCBSSPSW 



I 

GQTOEEEMTM 
GIDQPPFGIP 
NDNPPVFSQQ 
LSENTGEVRT 
QYSARIEEMI 
WKAIiDYBQL 
TVQKGISSKK 
NFDSTFIVIIK 
VSARTLNiray 
SQNHSCEMPR 
LIiIiAPtiLIaLT 
ANGADFMBSS 
AATQVGICSS 
DCLLIYDNEG 



IF^4aEIBBHS ASHSLVKILN IBO 

LTKSLDREQA SSYRLWSGA 240 

l^SELLRFQV TDLDEEYTON 300 

QSVKI^IAVK NKAEFHQSVI 360 

IiVDYIIiGXYQ AIDBDTMKAA 420 

TITAEVIiAID EYTGKTST6T 480 

TGPYTFALED QPVKIiPAVWS 540 

SLTLEVOQCD NROICOTSYP €00 

CDCQAGSTG6 VTCQFIPVPD 660 

EVCTDTYAKG TAVBGTSGME 720 



I YLVTETYSAS 



ADATGSPVOS VQCCSFIADD 
lESOGHFIEV QQTQFVKCQT 
GSLVQPSTAO FDPLLTQNVI 



Seq ID NO: 25 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 56-1642 



AfiTATCCCAG 
GCAAGGGATC 
CATGTTTaAQ 
CTOCTCTGTC 
GGAGAAGQTG 
GGAAQATCAG 
GGACTCTTTT 
CTTTTCCCAG 
GQAGATGGTA 
TAACTCAGGQ 



I , 
' GAGGAGCAAG 
CTTTCTCCGC 
TCCACAGCTG 
GTCTCTACCT 
AAAGTATACT 
GGTTGTOTCC 
GCCCTGAAGA 
ATCTTTGGGC 
AAGGATGTAC 
AAAACCCACA 
CTGATCTTCA 



21 



31 



I I 1 

TGGCACGTCT TCGGACCTAG GCTGCCCCTG CCGTCATGTC 
CAOCGGGCTT OCTGTCCGAT GACGATGTCG TAGTTTCTCC 
CAOATnGGG GTCTC5TGGTA CGCAASAACC TOCTATCAGA 
CCCTAGAGGA CAA6CAGCAG 6TTCCATCTO AGGACAOTAT 
TQASQGTTAO aCCCTTGTTA OCTTCASAST TGBAACGACA 
GTATTGAGAA TGTGGAGACC CTTGTTCTAC 
GCAATGAACQ GGGAATTGGC CAAGCCACAC 
CAGAAGTGGG ACAGGCATCC TTCTTCAACX: 
TCAAAGGGCR GAACTGGCTC ATCTATACAT 
COATTCAAGG TACCATCAAG GATGGAGGGA 
ATWKXrrCCA AGGCXaACTT CATCCaACAC 
XAATCTGGCT AGACAOCAAG CAGATCOOAC 



ACSVGGTTCAC 
TAACTGTGAA 
ATGGA6TCAC 



GftSQAGIGTC 
TGGOCTCTCT 
ATGGGCACAG 
GATCTCATTC 
ACAGCGCAAG 
AGATCTCAAC 
TCGTAAGAAC 
CATCTTCTCA 



TACATOQAAA 
TCTATCaGTC 
CCAGAC3\CT6 
TTTGAGATCT 
AGGCAQACTT 
TGGATTCATG 
CAGAGCTTTG 



6TCGGATAGG TACCSGCACX: 
AGTGTACC3VG CAGTAGCCAG CTGGATGAAA 
CCCCftCTACC TGTCCCGGCA ARCATTGGCT 
ACAACGAACT GCTTTATGAC CTATTAGAAC 
TGCGGCTATG CGAGGATCAA AATGGCAATC 
TGCAAGATGC TGAGGAGGCC TGQAAGCTCC 
CCAGCACCCA CCTCAACCaUS AACTCCAGOC 



CTGATCTGAA 

CTTOCTTOAA 
TGQCATTGC 
CAAGTCATOG 900 
TCTCCATCTG 960 



CGCCTAGCCA 
CCTATGTGAA 
TAAAAGTGGG 
GCAGTCACAG 



S OAAGCAaaAA ACATXAACAC CTCTCTACAC ACCCIG 



CGAGTGTTCC 
CCCTGTGCAT 
CAGGTGACTT 
CAAGGAACAT 
CCTTGATQAT 
ACAAGTTGTQ 



AAGGTTTCTT 
CTACCTATGA 
GTGCATGCCX: 
AGTCTTCAGG 
GATATTGAAA 



CCTCAGCCAA 
TAAGQCTGTT 
CTTGTTGCCA 
TCTTAATCAA 
TGGACCTTOG 
AAGGCCAGGT 
AACAACCACC 
AAAGCTCAAC 
TCAAATCTGG 
GGCCCTGAGO 
TATCAGGAAT 
TATAACCACC 



ACAOTGOXGC 
AAAACTAAAT 
GOATGAAAAG 
CCATCAG(aA 
CTCCACCC3VG 
CTCTACCACT 
GCCCTTCACC 



TTCTQCCAAA 
AGGGAAGAAA 
AGACTGCAGC 
GCCTTTTGGC 



TATATCCAGQ 



CTOOGAOATG 

aqtoaacatt 
atcx:tcaag6 
attgaagagc 
tcagggtctg 
cagcttcagg 
gaagagttgc 
attgatgtgg 

CTTCAGAAAC 
GCAGGAAAAC 
ACTCTGGCTG 
GCATGTATTG 
AAGCSSCCTTG 
CCATTCCTTC 
CCTTATGCCC 
AAAAAQTACT 
ACTCTCCTGA 
ATGCAATACT 



ATGAAGCIGA 
AGACACTQCT 
AAATTT6CAA 



AGTCACTGAC 
TAGAAGCTCT 
AATTGGCCCT 
AGGTTAAAGC 
ATAAGTATCA 
ACAAGAAGTT 



G6CCGTTCCT 
CATGTGGCCA 
ACTGGGArTC 
CTTAGAGAAA 
CATCTCCATG 
TTTGAAGGAA 
TGAGATGOTA 
AAAGGAACTA 
AAGTTTTTAC 
CTTGCAGGAA 
ACGGCGGTCA 
TAAATTACAG 
GAAAAXGTCA 
AGAAGAGGGC 



CCATCCCTGC 
GGGGCTAAGG 
TATGGCAAAG 
CGAC3VGGAAA 
GAACAGATGC 
TTGGAGGAAA 
CAAGAAGAGA 
GCCAGACAAC 
CAAAGGTTGG 
CABTGCAAAO 



CAGAAGAATA 



TTCGTCAAGC 
AACTGCa«3AA 
CTGAGCAGTA 
GTACCSkACCa 
GAAATTTACT 
GGATOCTAOB 
AAGQCTGTQO 



CTTGACCACT 
CAACATGGTQ 
TCATACTGT6 
GGAAAATCAG 



h AATTATAAAA GGGACAGAAA J4 



CITTTTTCrC 
TrXACTTATA 
TTTTTTATTa 



TGTGATGACA 
CTAGTGAAAC 
TTGAAACTCC 
CAACCAAACC 
CCAACCTGCC 
TCCCCTTTAC 
OAGCftGTCAT 
CTTTACCATA 
ACTTTTQIAT 
TQAmCTAT 
AATTCCAAAT 



1020 
1080 
1140 

1260 
1320 
1380 
1440 

1500 
1560
1620
1680
1740 
1800 
1860 

1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520

2580

2640 
2700 
2760 
2820 



21 



31 



41 



51 



I I I I I 

MSQGIIiSPFA GLIiSODOVW SPMFESXAAD IiGSWRKNUj SDCSWSTSb BDKQQVPSBD 
SMEKVKVYLR VRPLLPSELE RQEDQQCVHI HJVETLVLQA PKDSFALKSN ERGIGQATKH 
PTFSQIPGPE VGQASFPMLT VKEMVKDVLK GQNWLrXTYG VTNSGKTHTI CJGTIKDGGIL 
PRSLALIEMS IiQQQIiHPTPD LKPLLSHEVI HUSSKQIRQB EMKEOaSIiIiNG GI1QEBEI.STS 
UOISWIBSR ZarSTSFDSG lACLSSISQC TSSSQLDETS HRWAQPDTAP LPVPAHIRFS 
IWISFFEIYM ELLYDbliBPP SQQRKIiQTI.R LCEDQHGNPY VKDUmiHVO DAEBAHKIiLK
VraUCHQSFAS THIiNQNSSRS HSIFSIRII>B IiQGEGDIVPK ISELSU33LA GSBRCKDQKS 
GERLKEAGNI NTSLHTLCRC lAALRQNQQH RSKQNLVPFR DSKLTRVFQG PFTCR6RSCH 
IVNVNPCAST YDETUIVAXF SAIASQVTC& CPTYATGIPI PALVUQGT 

Seq ID NO: 27 DNA sequence
Nucleic Acid Accession #: r
Coding sequence: 13-1424






\ CAATQAAQTT 



GQAAACTTAA 
QGGCAACTGG 
GTCCATCATT 
AGAATCAATA 
GCTTTCCAAO 
GCTGACATTT 



TTTATGGCCT 
TGAAGGAAAA 
WryiCATCTAC 
TCAGGGAAAT 
ATTACACACC 



21 
I 

TCTTCTAATA 
AAGCCTGGAA 
TGAGATAAAC 
AATCCAAGAA 
CCTGGAGATG 



31 
1 

CT6CICCT6C 
AAAAATAATG 
AAACTTCCAG 
ATGCAGCACT 
ATGCACGCAC 



AGGCCACTGC 
TCCTATTTGG 
TGACAAAAAT 



TGACAT6AAC C 



} ACQAATTCTG 
i TTGGCCATTC 
ACAAATATGT 
CCCTGTATGG 
CTCTCTGTQA 



AOIGTTAATT TAATTTCTTC 



6ACTACACAT 
CTTAGGTCTT GGCCATTCTA 
TGACATCftAC ACATTTCGCC 
AGACCCAAAA GAGAACCAAC 
CCCCAATTTG AGTTTTGATG 
CAQGTTCTTC TGGCTGAAGO 
CTTATGGCCA ACCTTGCCAT 



GGAAACATTA 
TTGACTACGC 
GCAAGATTAA 
ACTTCC3VTGC 
GCATTOSAGa 
CAAACTTGTT 
GTGATCCAAA 
TCTCTGCTGA 
GCTTGCCAAA 
CTGTCACTAC 
TTTCTGAGAO 



51 
I 

TTCTG6AQCT 
TGAAAGATAC 
GAAATATAGT 
GAAAGTGACC 
AGTCCCCGAT 



QAAATTQAAG 
AATTTAAGAC CAOAG 
GIGAAAAAAA 
GATAACCAQT 
CTGATTACCA 
AACAAATACT 



CX»6AAATCA AGTTTTTCTT TTTAABOATG ACAAATACTQ 



\ TTATCCC3UU3 



GTGTACCACT 
TTATATAAAA 
CTCTACTATT 



ATTGGAGGTA 
AGAACTTCCa 
ACTATTTCTT 
AAACACTQAA 
TAGTTCACTT 
ACTTAGAOAT 



TGTTTTTAAC 
TGATGAAAGG 
AGGAATCGOG 
CCaAGQATCT 
AAQCAATAGC 
CAGCTTAATA 
ATGTATCATA 



AAGTTTGAAA 
TQCTTCCTAA 
TATATATATT 



ATAGTTACCT 
CATCCTTG6A 
TTOGCTCAAA 



AGCATACATT 
CCAOOTTTTT 
AGACAGATGA 
CCTAAAATTG 
AA CCauv rTTG 
TGGTTTGGTT 
AGTATTTATT 
AAAATAAAAT 
TOAAAACTCT 
TCAAASCAA8 



A7GCAGTCTT 
AATATOACTT 
GTTGAAAATG 



CTGTAAACCA 
AATTGTCCAT 
ATAATTCTAT 
ArACTTACTT 



AATCCGGAAA 420 
CACAGGCATG 480 
TTTTGATG6C 540 



TGACATACGT 780 
TCCTGACftAT 840 
CGTGGGAAAT 900 
ACCAAAGACC 960 
AGCTGCTTAT 1020 
QTTAATTAGC 
TOCTAACTTT 
CTTCTTT6TA 
TTATCCCAAA 
CTACTCTAAA 
CCTACTCCAA 
GTGTAATTAA 
TATGTCCTCA 
TAGGTAATGA 
TCTTGCTTGA 
TTGAAGCATG 
CTGGCATAAC 



1320 
1380 
1440 

1560
1620 
1680 
1740 



31 



41 



I I I I I I 

MKPLLII1LI.Q ATASGALPUI SSTSt.EKNNV liFGERYIiEKF YGLEINKLPV TKMKTfSGHLM 
KEKIQEMQHP LGrJWTGQLD TSTIdEMMHAP RCGVPDVHHF KEMPGGPWIR KHYITYRINN 
YTPDMNKEDV DYAIRKAFOV WSNVTPUCPS KIimWADIIi WPARGAHGD FHAFDGKGQI 
liAHAFGFGSG IGGDAHPDED BFWTTHSGGT NLFLTAVHEI GHSIiGLGHSS DPKAVMFPTY 
KYVDINTFRIi SADDIRGIQS LYGDPKHIQR LPNPKISEPA LCDPKLSPDA VTTVGNKIPF 
FKDRFFHUCV SERPKTSVNI. ISSLMPTLPS GTBAKiBIER. RIIQVFLFKDD XYHLISNIAP 
EPNVPKSIHS FGFPNFVKKI DAAVFNPRFY RTVPFVDNQV HRyDBRRQHM DPGXPKbXTK 
NFQGIGPKID AVFYSKNjacY YFFQGSNQFB TOFI.MJRITK TUCSHSWFGC 

Seq ID NO: 29 DNA sequence

Nucleic Acid Accession #: NM_006115.1

Coding sequence: 236-1765



11 



21 



31 



I I I I I 

GCTTCAGGGT ACAGCTCCCC CGCAGCCAGA AQCOGGGCCT " 
CGGGACACCC CACXXJGCTTC CCAIS3CGTGA CCTGTCAACR 
ACTCTCTGAG GAAAAACCAT TTTGATTATT ACTCTCAGAC 
GAGACCTAGA AATCCAAGCG TTGGAGGTCC TGAGGCX3U3C 
ACGAAGGCGT TTGTGGGGTT CCATTCAQAG COGATACATC 
CCCACX3SASA CTTGIGGAGC T8SCAGG6CA GAGOCTGCTG 



GCAQCCCCTC AGCACCGCTC 



AASQAXOASG COCTGGCCAT 360 



TCIGGGAGT6 CIOATGAAGG GACAACATCT TCAOCIGSAO ACCTTCAAA8 



GGATTTACGG 
TCTOTACTCA 
TGGTTTGAGC 
CCTCAAGGAA 
GAAAAATGTA 
TATCAA6ATG 



GTGCTCCTTG 
AAGAACTCTC 
TTTCX3VGAGC 



GGTGCCTGTQ 
CIAOGCCT6T 
ATCCTOAAAA 



CCCAGQAGGT 
ATCAGQACTT 
CAGAAGCAGC 
AQCAGCCCTT 



GCTGTAAGAA 



GCAGTATATC 
TGTGGACTCT 
CCCCTTGGAA 
QTCCCAGAGT 
CGATGTAAGT 
CCTGGTCriT 



CTCCTCTCCC 
GCCCAGTTCA 
TTATTTTTCC 
ACCCTCTCAA 
CCCAGCGTCA 
CCCX3AGCCCX: 



TTAGAGGCCG 
TAACTAACTG 
GTCAQCTAAG 
TCCAAGCTCT 



TCGCKCCAGG 
CT6GACTGTA 
TCAGCCCATQ 
CATTCCAGTA 
CTCCTACCTC 
GCTOAAQATT 
GGACTCTATT 
TTCTCCTTAC 
ATCTTCCTAC 
CCrCAGTCTG 
CCTGGATCAG 
CCGGCTTTOG 
TGTCCTGAGT 
GCTGGAQAGA 
TGATCAGCTC 



TGGTCTGGRA 
ACAAAGAAGC 
GAGGTGCTCG 
ATTGAGAAAG 
TTTGOVATGC 
GAAGATTTGG 



CCTGCCTCCC 
CTGTGCTTOA 
TTCAAGTGCT 
ACAGGGCCAG 
QAAAAGTAQA 
TAGACCTGTT 
TGAAGCGAAA 
CCATCCAQGA 
AAGTGACTTG 
TQATTAATCT 



CAOTGCCTGC 



CTAAGTGGGG 
GCCTCTGCCA 
CTTGCCCTCC 
AATTCCATCT 



7166CTCTCTA 
ACGTGATGAA 
TGATGCATCT 
TCaTGCTGAC 
CCCTCCAGOA 
TQCCTTCCCT 
CCATATCTGC 



1380 
1500 



TQCAOAGT CTCCTGCAGC



ACXTTCATCGG 



TAOTCGCAAC CCCTaTCCTC 
GTGCCCCTGT TTCATGCCTA 
TTGGACACTA AAGCCAGGAT 
ACAAATGTTC AGTGTGAGTG 
GTTCAGTGAG GAAAAAAAGG 
GTOATCTTTG GGGAGATACA 
GATTCTGGCT TGGGAAOTAC 
TGTT6AAAAT AAAGAGAAGC 



ACTGTG06GA 
ACTAGCTGGG 
GTGCATGCAT 

GGAAGTTGGG 
TCTTATAOAG 
ATGTAGGAGT 
AATGTGAAGC 



GCTGAGCAAT 
TGGTACCCTC 
TOAOTTGOaO 
CAGAACCTTC 
TGCAC»TATC 
CTTGAAGCAA 
GTTCAQTGAG 
GATAGGCAGA 



CTGACCCACG TGCTGTATCC 
CftCCIGGAGH GGCTTGCCTA 



AAATGCTTCA 
CAAAGCAGCC 
GAAAAAACAT 
TGTTGACTTG 
AATCTGAATT 



V AMAAAAA 



AGCCCATCCT 
TTCTGCATAC 
ACAGTTTCAG 
TCAGACAAAT 
AGGAGTTAAT 
TCTAAAOGGA 
GTAAA6AAAC 



1560 
1620
1680 

1800 

1920 
Seq ID NO: 30 Protein sequence:



GCTTCAGGGT 
CGGGACACCC 
ACTCTCIGAG 



aogaaggcgt 
ccx:acggaga 
tgccgccctg 
CGGGAGACAC 



TGGACTTGAT 



AATCCAAGCG 
TTGTGGGGTT 
CTTCTGQAGC 
QAGTTGCTGC 
AGCCRGACCC 
CTGATGAAGG 
gtgctccttg 
AAGAACTCTC 



I I I 

AGCCGOSCCT GCRGOGCCTC AGCACCXSCTC 
CCXGTCAACA GCAACTTCQC GSIQIGGTGA 
TTTGATTATT ACTCTCAOAC GTGOSTQGCA ACAAQiaACT 



CCTCAAGGAA 
GAAAAATOTA CTACQCCTQT 
TATCAAGATG ATCCTGAAAA 
TACCTGGAAG CTACCX:ACCT 
GCGTAQACTC CTCCTCTCCC 
GCAGTATATC GCCCAGTTCA 
TGTGOACTCT TTATTTTTCC 



CCATTCAOAO 
TGGCA.GGGCA 
CCAGGGAGCT 
TGAAGGCART 
GACAACATCT 
CCCAGGAGGT 
ATCAGQACIT 



TGGTGCAGCT 
TGGCGAAATT 
ACATCCATGC 
CCTCTCAGTT 
TTASAQGCCG 
TAACTAACTG 



COOATACATC 
GAGCCTGCTG 
CTTCCCGCCA 
GGTGCAGGCC 
TCACCTGGAG 
TCGCCCCMSa 



AGCATOAGTG 
AAGGATGAGG 
CTCTTCATGG 
TGGCCCTTCA 
ACCTTCAAAG 
AGGTGGAAAC 
TGGTCTGGAA 
ACAAAGAAGC 



TGTGGACAAG 
CCCTGGCCAT 
CAGCCTTTGA 
CCTGCCTCCC 
CTGTGCTTGA 
TTCAAGTGCT 



CTCCTACCTC 
GCTGAAGATT 
GGACTCTATT 
TTCTCCTTAC 
ATCTTCCTAC 
CCTCAGTCTQ 



ATTGAGAAAG 
TTTGCAATGC 
GAAGATTT06 



ATTTCCCCGG 



OQATGTAAST 
CCTGGTCTTT 
GAGCCACTGC 
CTTGCAGAGT 
TGTCCCCCTG 
TCTGCATGCC 
TAGTGCCAAC 
GTGCCCCTGT 



: TCCAAGCTCT 



ACAAATGTTC 
GTTCAGTGAG 
GTGATCTTTG 



TTCATGCCTA 
AAGCCAGGAT 
AGTGTGAGTG 



CAACCTTAAG 
: ACCTCATCGG 
3 AGGACATCCA 
} AGTTGCTGTG 
: ACTGTGGGGA 



CCGGCTTTCG 
TGTCCTGAGT 
GCTGGAGAGA 
TGATCAGCTC 
CTTCTACGGG 
GCTGAGCAAT 
TGGTACCCTC 



GTGCATGCAT 
AGGAAAACAT 
GGAAGTTGGG 
TCTTATAGAG 



CAGAACCTTC 
TGCACATATC 
CTTGAAGCAA 
QTTCAGT6AO 



TOTTOAAAAT AAAQAOAAGC AATCrTGAAGC A 



TTGCTCAGGC 
GAAGGGGATQ 
CTAAGTGGGQ 
GCCTCTGCCA 
CTTGCCCTCC 
AATTCCATCT 
CTGACCCACG 
CACCTGOAOA 
CGGCCCAGCA 
TATGACCCGG 
AAATGCTTCA 
CAAAGCAGCC 
GAAAAAACAT 
TGTTGACTTG 
AATCTGAATT 
GTAOACTOTT 



GAAAAGTAGA 
TAGACCTGTT 
TGAAGCGAAA 
CCATGCAGGA 
AAGTGACTTG 
TGATTAATCT 
AGAAGGAAGA 
AGGCTCTCTA 
ACGTGATGAA 
TGATGCATCT 
TCATQCTGAC 
CCCTCCAGGA 
TGCCTTCCCT 
CCATATCIGC 
TGCTGTATCC 
GGCTTGCCTA 
TGGTCTGGCT 
AGCCCATCCT 
TTCTGCATAC 
ACAGTTTCAG 



AGGAGTTAAT 
TCTAAAG6GA 
GTAAAGAAAC 



CTGACCCTCG TGATCTTCAG 
CCTTCTAAAC TAGAGGCAGA 
TCTOCAOACC TCATCCGGTC 
TACACAGCCA GGGCT6TTGC 



CCGGCGCTCC GTGCGCGGAG 
TCGTGATGGT GAAGCCTGCA 
CAAAATAATT GGCAGAGTTA 
AAQTGATCCT GATTTCAGAG 
QCTGTCTGAT AAGAAAAGAT 
GAAAGAGGTT ACTGTGCTGC 
JTOTT 



41 

1 

TOCTGGCCCT 
COOTCTGCCT 
AAAAGGTGAT 
ATTTGGAAGA 
TTCTAAATGA 
CATTTACCAT 
TAGAACAXCA 



1020 

1080

1140 
1200 
1260 
1320 



1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 



GCOCGGCATC 
GCATCTGCTG 
ACTTAATGTA 
GTGCTTCAGG 






GAAATCTATT TTGCACTCGG 
CTTATGC6TC AACTGCAGAT 
TAGAGGATGA AAATQACAAC 
TGGAAAGTAG TAGACCTGGT 
CGGACACAAT GCATACGCGC 
GGCTCTTTTC TGTGCATCCC 

GAGAGGTTGT AGACRACTAC 

TCATTOATAA TGAAA6TACA AOACATaOAT GGCCAGTTTT TTGGATTQAT AGGCACATCA 



GGATATTCAG 
CACCCTGTTT 
ACTACAGTGG 
CTGAAATACA 



TAAATTTGTT 
GTGAAGAATA 
CAGATCTGCC 
TCACAGAAGC 



TTAT ATAGAA A GAGAC ACTO 
TGATGTTTTT GATTTGATTG 
CCTCCCACTA CCCATCAGGG 
AATTTATAAT TTTGAAGTTT 
TGCCACAGAC AGAGATGAAC 
GCAGACACCA AGGTCACCTG 



ACTTGTATCA TAACAGTAAC AGATTCAAAT GKTAATGCAC 



GATAAGGATT TAATTAACAC 
GAAAATGGAC ATTTCAAAAT 
GTAAAGCCAC TGAATTATGA 
GAAGCGCCAT TTGCTAGAGA 



ATTAAAGAAA ACTTAOCAOT 



AAATGCATTC 
TGCGAATTGG 
CAGCACAGAC 
A6AAAACCGT 
TATTCCCAGA 
TGAGGGGCCT 
GGGGTCAAAO 



AATGTGGAAA 
AGAlSTCAATT 
AAAGAAACTA 
CAAGTGAACC 
GTOACAGCCT 
GAATGCACTC 
ATCAAGQGCT 



OCACTTTCAQ ACAAAATGCT 
TCTTAOGAAT ACCTATAQAA 
TTACCATTTT AAAGGOAAAT 
ATGAAGGTGT TCTTTCTGTT 
TGGAAATTGG AGTAAACAAT 
TGAACAiGAGC CTTGGTTACA 
CTGCAGCCCA ATATGTGCGG 
ATAAGGCATA TGAGCCGGAA 



1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500




AATAGAAATG GCAATGGTTT AAGGTACAAA AAATTGCATG ATCCTAAAGG TTOGATCftCC 1620 

ATTGATGAAA TTTCAGGGTC AATCATAACT TCX3UVAATCC TGGATAGGGA GGTTGAAACT 1680 

CXXAAAAATG AGTTGTATAA TATTACAGTC CTGGCAATAQ ACAAAGATGA TAGATCATQT 1740 

ACTGGAACAC TTGCTGTGAA CATTGAAGAT GTAAATGATA ATCCACCAGA AATACTTCAA 1800 

GAATATGTAG TC3VTTTGCAA ACCAAAAATG GGGTATACC3G ACATTTTAGC TGTTGATCCT 1860

GATQAACXna TCCATGQAGC TCCATTTTAT TTCAGTTTGC CCAATACTTC TCCAGAAAJC 1920 

A6I3U3ACIGT GGAGCCTCAC CAAAGTTAAT GATACAGCTG CCCGTCTTTC ATATCAGAAA 1980 

AATGCIGQA.T TTCAAGAATA TACCATTCCT ATTACTGTAA AAGACAGGGC CGGCCAAGCT 2040 

GCAACAAAAT TATTGAGAGT TAATCTGTGT GAATGTACTC ATCCAACTCA GTGTOSTGCG 2100 

ACTTCAAGGA GTACAGGAGT AATACTTGGA AAATGGGCAA TCCTTGCAAT ATTACTGGGT 2160

ATAGCACTGC TCnTTCTGT ATTGCTAACT TTAGTATGTG GAGTTTTTGG TGCAACTAAA 2220 

QGOAAACXSTT TTCCTGAAGA TTTAGCACAG CAAAACTTAA TTATATCAAA CACAGAAGCA 2280 

CCTGGAGACG ATAGAQTQTG CTCTGCCAAT GGATTTATGA CCCAAACTAC CAACAACTCT 2340 

AGCCAASGTT TTTGTGGTAC TATGGGATCA GGAATGAAAA ATGGAGGGCA GGAAACCATT 2400

GAAATGATGA AAGGAGGAAA CCAGACCTTG GAATCCTGCC GGGGGGCTGG GCATCATC3VT 2460

ACCCTGGACT CXTTGCAGGGG AGGACACACG GAGGTGGACA ACTGCAGATA CACTTACTCG 2520 

GAGTGGCACA GTTTTACTCA ACCCCGTCTC GGTGAAAAAT TGCATOGATG TAATCAGAAT 2580

GAAGACCGCA TGCCATCCCA AGATTATGTC CTCACTTATA ACTATGAGGG AAGfcBGATCT 2640 

CCAGCTGGTT CTGTGGGCTG CTGCAGTGAA AAGCAGGAAG AAGATGQCCT TGACTTTTTA 2700 

AATAATTTGG AACCCAAATT TATTACATTA GCAGAAOCAT GCACAAAGAG ATAATQTCAC 2760

AGTGCTACAA TTAGGTCTTT GTCAGACATT CTGOAGOnT CCAAAAATAA TATTGTAAAG 2820 

TTCAATTTCA ACATGTATGT ATATGATGAT TTTTTTCTCA ATTTTGAATT ATGCTACTCA 2880 

CCAATTTATA TTTTTAAAGC CAGTTGTTCC TTATCTTTTC CAAAAAGTGA AAAATGTTAA 2940

AACAGACAAC TGGTAAATCT CAAACTCCAG CACTGGAATT AAGGTCTCTA AAQCATCTGC 3000 

TCTTTTTTTT TTTTACGGAT ATTTTAGTAA TAAATATGCT GGATAAATAT TAGTCCAACA 3060

ATAGCTAAGT TATGCTAATA TCACATTATT ATGTATTCAC TTTAAGTGAT AGTTTAAAAA 3120 

ATAAACAAGA AATATTGAGT ATCACTATGT GAAGAAAGTT TTGGAAAAGA AACAATGAAG 3180 

ACTGAATTAA ATTAAAAATG TTGCAGCTCA TAAAGAATTG GGACTCACCC CTACTOCACT 3240 

ACCAAATTCA TTTGACTTTG GAGGCAAAAT GTOTTQAAGT GCCCTATOAA GTAGCAATTT 3300 

TCTATAGGAA TATAGTTGGA AATAAATGTQ TBT G TOTATA TTATTATTAA TCAATGCAAT 3360

ATTTAAAATG AAATGAGAAC AAAGAGGAAA ATGGTAAAAA CTTGAAATGA GGCTGGGGTA 3420 

TAGTTTQTCX: TACAATAGAA AAAACAGACA GCTTCCTAGG CXrTGGGCTCT TAAATGCTGC 3480 

ATTATAACTG AGTCTATGAG GAAATAGTTC CIGTCXyuVTT TGTGTAATTT GTTTAAAATT 3540 

GTAAATAAAT TAAACTTTTC TGGTTTCTGT GGGAAGGAAA TAGGGAATCC AATGGAACAG 3600

TAGCTTTGCT TTGCAGTCTG TTTCSkAGATT TCTGCATCCA CAAGTTAGTA GCAAACTGGG 3660

GRATACTCGC TGCAGCTGGQ GTTCCCTGCT TTTTGaTAaC AAGGGTCCAQ AGATGAGGTG 3720 

TTTTTTTCGG GGAGCTAATA ACAAAAACAT TTTAAAACTT ACCTTTACTG AAQTTAAATC 3780 

CTCTATTGCT GTTrCTATTC TCTCTTATAQ TGACCAACAT CTTTTTAATT TAORTCCRAA 3840 

TAACOVTOTC CTCCTAGAGT TTAQAOGCTA 6ASGGA6CIG AOGaGAGGAT CTTACTGAAA 3900 

GCACCCTGGG GAGATTGATT GTCCTTAAAC CTAAGCCCCA CAAACTTGAC ACCTGATCAG 3960

QTCTGGGAGC TAC34AAATTT CATTTTTCTC CTCACTGCCC TTCTTCTGAG TGGCATTGGC 4020 

CTGAATCAAG GAAAGCCAGG CCTTGTGGGC CCCCTTCTTT CGGCTTTCTG CTAAAGCAAC 4080 

ACCTCCAQCA GAGATTCCCT TAAGTGACTC CAGGTTTTCC ACCATCCTTC AGCGTGAATT 4140 

AATTrrTAAT CAGTTTGCTT TCTCCAGAGA AATTTTAAAA TAATAQAAGA AATAGAAATT 4200 

TTGAATGTAT AAAAQAAAAA GATCAAGTTG TCATTTTAQA ACAGAQGGAA CTTTGGGAGA 4260

AAGCAGCCCA A6XAGGTTAT TT6TACAGTC AGAGGGCAAC AGGAAGAXGC AGGCXTTTCAA 4320 

GGGCAAGOAiQ AGGCCACAAG OAATATGGGT GGQAOTAAAA 6CAACATCGT CTGCTTCATA 4380 

CTTTTTCCTA GGCIT6QCAC TOOCTTTTOC TTTCTCAOGC CAAT6QCAAC TGCCATTTGA 4440 

GTCOGOT6AQ GGATCAOCCA ACCTCTTCTC TATGGCTCAC CTTATTTGGA GTGAGAAATC 4500 

AAGGAGACAG AGCTGACTGC ATGATGAGTC TGAAGGCATT TGCAGGATGA GCCTOAACTG 4560

GTTGTGCAGA ACAAACAAGG CATTCATGGG AATTGTTGXA TTCCTTCTGC AGCCCTCCTT 4620 

CTGGGCACTA AGAAQGTCrA TGAATTAAAT GCCTATCTAA AATTCTGATT T ATTC CTACA 4680 

TTTTCTGTTT TCTAATTTGA CCCTAAAATC TATGTGTTTT AGACTTAGAC TTTTTATTGC 4740 

CCCCCCCOCC TTTTTTTTTG AGACGGAGTC TCGCTCTGAC GCACAGGCTG GAGTGCAGTG 4800 

GCTCCGATCT CTGCTCACTG AAAGCTCCGC CTCCGGGGTT CATGCCATTC TCCTGCCTCR 4860

aCCTCCIGAG TAfiCTGGGAC TACAGGCGCC CACCACCAOG CCCGGCTAAT TTTTTGTATT 4920 

TTTAATAOAa AOGGOSTTTC ACTaTGTTAG CXaVGGATGGT CTCGATCTCC TGACCTCGTG 4980 

ATCCGCCTGC CTOOGCCTCC CAAAGTGCTG GGATTACAGG CATGACCCAC CGCTCCCGGC 5040 

CTTGTTTTCC GTTTAAAGTC UTC 'X'i'CX'lTT AATGTAATCA TTTTGAACAT GTGTGAAAGT 5100 

TGATCATACG AATTGGATCA ATCTTGAAAT ACTCAACCAA AAGACAGTCG AGAAGCCAGG 5160

GGGAGAAAGA ACTCAGGGCA CAAAATATTO QTCTGAGAAT GGAATTCTCT GTAAGCCTAG 5220 

TTGCTGflAAT TTCCTGCTGT AACCAGAAGC CAGTTTTATC TAACGGCTAC TGAAACACCC 5280 

ACTGTGTTTT GCTCACTCCC TCACTCACXX3 ATCAAAACCT GCTACCTCCC CAAGACTTTA 5340 

CXAGTGCOSA TAAACTTTCT CAAAGAGCAA CCAGTATCAC TTCCCTGTTT ATAAAACCTC 5400 

TAACCATCTC TTTGTTCTTT GAACATGCTG AAAACCACCT GGTCTGCATG TATGCCX!GAA 5460

TTTGTAATTC TTTTCTCTCA AATGAAAATT TAATTTTAGG GATTCATTTC TATATTTTCA 5520 

CATATGTAGT ATTATTATTT CCTTATATGT GTAAGGTGAA ATTTATGGTA TTTGAGTGTG 5580 

CASjGAAAATA TATTTTTAAA C3CTTTCATTT TTCCCKCAGT GAATGATTTA GAATTTTTTA 5640 

TGXAAATATA C3VGAATGTTT TTTCTTACTT TTATAA6GAA GCAGCTGTCT AAAATGCAGT 5700

GGGGTTTGTT TTGCAATGTT TTAAACAGAG TTTTAGTATT GCTATTAAAA GAAGTZACTT 5760

TGCTTTTAAA GAAACTTGGC TGCTTAAAAT AAGCAAAAAT TGGATQCAXA AAGTAATATT 5820

TACAGATOTG GGGAGATGTA ATAAAACAAT ATTAACTTGG TTTCTTGTTT TTGCTQTATT 5880 

TAGAGATTAA ATAATTCTAA GATGATCACT TTGCAAAATT ATGCTTATGG CTGGCATGGA 5940

AATAGAAATA CTCAATTATG TCTTTGTTQT ATTAATGGGG AATATTTTGG ACAATGTTTC 6000 

ATTATCAAAT TOTCGACATC ATTAATATAT ATTGTAATGT tGGGAAGAGA TCACTATTTT 6060

GAAGCACAGC TTTACAQATG AGTATCTATG ATACATATGT ATAATAAATT TTGATCGGGT 6120 

ATTAAAAGTA TTAGAAGGTG GTTATAATTG CAGAGTATTC CATGAATAGT ACACTGACAC 6180 

AGGGGTTTTA CTTTGAGGAC CAGTGTAOrC AAQGGAAAAC ATGASTTAAA A AGAA AAGCA 6240 

GGCAATATTG CAGTCTTGAT TCTGCCACTT ACAGGATAGA TAATGCCTGA avCTTTAATQA 6300

CAAGATGATC CAACCATAAA GQTQCTCTGT GCTTCACAGT GAATCTTTTC CCCATCCAOa 6360

AGTGTGCTCC CCTACAAACG TTAAGACTGA TCATTTCAAA AATCTATTAG CTATATCAAA 6420 

AGCCTTACAT TTTAATATAG GTTQAACCAA AATTTCAATT CCAGTAACTT CTATTGTAAC 6480 

CATTATTTTT GTGTATGTCT TCAAGAATGT TCATTGGATT TTTGTTTGTA ATAGTAAAAT 6540 

ACCGGATACA TTTCACGTOT CCTTCAGTAT TGATTTGGTT GAATATTGGG TCATAATGGT 6600

TQAGAAGCAT GGACACTAGA GCCAGAATGC TTGQATATGA ATCXTTOGATC TGTCACXTAC 6660

TTCTOTGTGA CCTTTQAAAQ GCTWrrrATT TCCTCTCTTA GCTTTCTCAT TAAAATCAAT 6720 

OAACAATQOC AGCCTCATOQ GGTTOTTGAA TCATTAAATT ABTTAATAXA CCIAAAGTAC 6780 
ATAGAACACI GCCTGCACA.T AQTAAAA6AA TTATAikOTCT GAGGTAGTTG 6TAAAA.TTAT
GTAOTTGOAr ATACTACCQA ACAATATCTA ATC 
TATATATAAT CCCGAAACAT G 



AGGGAAATAA AGTTTCXGCA 






MAAAGPRRSV 
ADLIRSSDPD 
KTRHTRETVI, 



11 
I 

RGAVCLHLLL 
FRVLNDGSVY 
RRAXRRWAFI 
DraNLFCTRP 
EVLBSSRPGT 



I 



EAPVBENAFH VEILRIPIED 



NGYKAYDPBN 
AIDKDDRSCT 
SIiPtrrSPBIS 
CTHPTQCaiAT 
NIiIIStlTBAP 



TARAVALSDK 
PCSMQENSIiG 
VDREEYDVFD 
TVGWCATDH 
LIKKVQDMDG 
KDLINTANWR 
APFABOIPRV 



GTLAVMIEDV 
RI1WSI.TKVMD 
SRSTGVILGK 
GDORVCSANB 



TKLIiRVNIiCE 
KRFPEOLAQQ 
MHKGOfQTLE 

DRHPSQDYVIi TXNYE6ROSP AGSVGCCSBK QEEDGUIFUT III.EPKFITLA EACTKR 



31 

I 

ACKKVIIJJVP 
KRSFTIWLSD 
PFPLFLQQVE 
LIAYASTADG 
DEPDTMHTRL 
QFFGLIGTST 
VNFTIbKGlJB 
TAUniALVTV 
LHDPKGWITI 
NDNPPGILQE 
TAAIU.SYQKN 
WAILAIIiLGI 
FMTQTIWNSS 
VDNCRYTYSK 



I 

SiOiEADKIIS 
KRKQTQKBVT 
SDAAQNYTVF 
YSADLPLPLP 
KYSILQQTPR 
CIITVTOSND 
HGHFKISTOK 
HVRDLDEGPE 
OEISGSIITS 
YWICKPKMG 
AGFQEYTIPI 
AliFSVLLTIi 



SI 
I 

RVtrLEECFRS 
VLLEHQKKVS 
YSISGRGVDK 
IRVEDENDNH 
SPGLPSVHPS 
NAPTPHQNAY 
ETNEGVI.SW 
CTPAAQYVRI 
KIU3REVETP 
YTDIIAVDPD 
TVKDRAGQAA 
VCGVFGATKG 
MKHGGQETIE 



Seq ID NO: 33 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 64-2583

21



CCQATGGCCG 
CTGACCCTCG 
CCTTCTAAAC 
TCT6CAGACC 



I I 

GCTCTCGGCA CCCTCCCGGC 
CCGCTGGGCC CCGGCGCTCC 
TGATCTTCRG TCaTGATGGT 
TA6AGGCAGA CAAAATAATT 



31 



QACAAAAGQA 



ATTCCTTGCT 



AAAGAACCTT 
CCTGTGGATC 
GGATATTCAG 



AACAGACACA GAAAGAGGTT 
OACACACTAG AGAAACTGTT 
CTATGCAAGA GAATTCCTTG 
CAGCACAGAA CTATACTGTC 
TAAATTTGTT TTATATAGAA 
GTGAAQAATA TGATGTTTTT 
CaORTCTGCX: CCTCX:CACTA 
TCACAGAAGC AATTTATAAT 



I 

GCCCGCGTTC TCXTO3CCCT GCCCGGCATC 
GTGCGCGGAG CCGTCTGCCT GCATCTGCTG 
GAAGCCTGCA AAAAGGTGAT ACTTAATGTA 
GGCAGAGTTA ATTTGGAAGA GTGCTTCAGG 
GATTTCASAB TTCTAAATGA TGGGTCAGTG 
AAOAAAAQAT CATTTAOaT ATOGCTTTCT 



GGCCCTTTCC 
TTCTACTCAA 
AGAGACACTG 
GATTTGATTG 
CCCATCAGGQ 
TTTGAAGTTT 



CCAAGAGGAG 
CATTGTTTCT 
TAAGTGGACG 
GAAATCTATT 
CTTATGCGTC 
TAGAGGATGA 



TCATTGATAA 
ACTTQTATCA 
TATGAAGCAT 
GATAAGGATT 
GAAAATGGAC 
GTAAAGCCAC 



TAATCACCAC AGTCTCTCAT 
TGfiAAGTACA AGACATGGAT 
TAACAOTAAC AGATTCAAAT 
TTGTAGAGGA AAATGCATTC 
TAATTAACAC TGCCAATTGG 
ATTTCAAAAT CAGCACAOAC 
TGAATTATGA ASAAAACCGT 
TTGCTAOAGA 



kTTCCCAGA 



GGCCAGTTTT 
GATAATGCAC 
AATOTGOAAA 
AGAGTCAATT 
AAAGAAACTA 
CAAGT6AACC 
GTGACAGCCC 



ATTAAAGAAA 
AATA6AAATQ 
ATTGATGAAA 
CCCAAAAATG 
ACTGGAACAC 
GAATATGTAG 
QATGAACCIO 



ACTTAGCAGT 6GGGTCAAAG 
GCAATGGTTT AAGGTACAAA 
TTTCAGGGTC AATC»TAACT 
AGTTGTATAA TATTACAGTC 
TTGCTGTGAA CATTGAAGAT 
TCATTTGCAA ACCAAAAATQ 
TCCATGOAGC TCCATTTTAT 



ACTTCAAGGA 
ATAGCACTGC 
GQGAAACGTT 
CCTGGAQACG 
AOCCAAGGTT 



TCTTTTCTGT 
TTCCTGAAGA 
ATAGAGTGTG 



ACCCTGGACT CCTGCAQGGG 
GTTTTACTCA 
ACATAAAAGA 
GTCCTCACTT 
GAAAAOCAGG 
TTAGCAQAAG 
ATTCTGGAGG 
GATTTTTTTC 
TGCTTATCTT 
CAGCACIOOA 
TAATAAATAX 
ATTATGTATT 
TGIOAAQAAA 
TCATAAAOAA 



TAATCTGTGT 
AATACTTGGA 
ATTGCTAACr 
TTTAGCACAG 
CTCTGCCAAT 
TATGGGATCA 
CCAGACCTTG 
AGGACACAOS 



ATCAAGGGCI 
AAATTGCATG 
TCCAAAATCC 
CTGGCAATAG 
GTAAATGATA 
GGGTATACCG 
TTCAGTTTGC 
GATACAGCT6 
ATTACTGTAA 
GAAXGTACTC 
AAATGGGCAA 
TTAGTATGTG 
CAAAACTTAA 
GGATTTATGA 
GGAATGAAAA 
GAATCCTGCX: 
QAGGTGGACA 



TTGGATTGAT 
CCACTTTCAG 
TCTTAOQAAT 
TTACCATTTT 
ATGRAGGTGT 
TGGAAATTGQ 
TOAACIVfiAfiC 
CIGCAGCOCA 
ATAAGGCATA 
ATCCTAAAGG 
TGGATAQGGA 
ACAAAGATGA 
ATCCACCAGA 
ACATTTTAGC 
CCAATACTTC 
CCGGTCTTTC 



ATGGGCACCT 
TCAACAAGTT 
TGGAGTTGAT 
TTGCACTCGG 
AACTGCAGAT 
AAATGACAAC 
TAGACCTGGT 
GCATACGOSC 
TGTGCATCCC 
AGACAAGTAC 
AGGCAC»TCA 



ACCTATAGAA 
AAAGGGAAAT 
TCTTTCTGTT 
AGTAAACAAT 
CXTGGTTACA 
ATATGTGOGG 



ATCXaACTCA 
TCCTTGCAAT 
GAGTTTTTGG 
TTATATCAAA 
CCCRAACTAC 
ATGGAGGGCA 



TTGGATCACC 
GGTTGAAACT 
TAGATCATGT 
AATACTTCftA 
TGTTQATCCT 
TCCAGAAATC 
ATATCAGAAA 
CGGCCAAGCT 



TAAAAATTAA 
CCAAGATTAT 
CTGCTGCAGT 
ATTTATTACA 
TTTGTCAGAC 
TQTATATGAT 



ACTGCAGATA 
CCATTAGAGG 
AATTGCATOG ATGTAATCAQ AATOAAGACC 
ATAACTATGA GGGAAGAGGA TCTCCAGCTG 
AAfiAAGATGG CCTTGACTTT TTAAATAATT 



TTTCCAAAAA 
TCRATTTTGA 
TTCXauUVAAG 
ATTAAGGTCr 
GCTGGATAAA 
CACTTTAAGT 
QTTTTGGAA7V 
TT8GGACTCA 



ATTATGCTAC 
TGRAAAATGT 
CTAAAGCATC 
TATTAGTCCA 
GATAGTTTAA 
AGAAACAATG 
CCCXTTACTGC 



AAGrrCAATT 
TCACCAATTT 
TAAAACAQAC 
TGCTCTTTTT 
ACAATAGCTA 
AAAATAAACA 
AAGACTGAAT 
ACTACCAAAT 



TGCAACTAAA 
CACAGAAGCA 
CAACAACTCT 
GGAAACCATT 
GCATCATCaT 
CACTTACTOQ 
ACACACTGGT 
GCaTGCCATC 
GTTCTGTGGG 
TGGAACCCAA 
CAATTAGGTC 
TCAACATGTA 
ATATTTTTAA 
AACTGGTAAA 
TTTTTTTACG 
AGTTATGCTA 
AGAAATATTG 
TAAATTAAAA 
TCATTTGACT 



1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 

2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
TTGGRGGCAA AATGTGTTGA AGTGCCCTAT GAAGTftGCRA TTTTCTATAG CaATATAGTT 3360

GGAAATAAAT GTGTGTGTGT ATATTATTAT TAATCAATGC AATATTTAAA ATOAARIGAG 3420 

AACAAAOAOG AAAATGGTAA AAACTTGAAA TGAGGCTGGG GTATAGTTTG TCCTACAATA 3480 

GAAAAAAGAG AC3AGCTTCCT AGGCCTGGGC TCTTAAATGC TGCATTATAA CTOAGTCTAT 3540

GAGGAAATAG TTCCTGTCCA ATTTGTGTAA TTTGTTTAAA ATTGTAAATA AATTAAACTT 3600 

TTCTGQTTTC TGIGGGAAGG AAATAGGGAA TOCAATGGAA CAOTAGCTTT GCTTTGCAGT 3660

CTGTTTCAAe ATTTCTGCAT CX»CAAGTTA GTAGC31AACT GGGGAATACT CX3CTGCAGCT 3720

GGGGTTCCCT GCTTTTrGGT AGCAAGGGTC CAGAGATGAG GTGTTTTTTT CGGG GAG CTA 3780 

ATAACAAAAA CATTTTAAAA CTTACCTTTA CTGAAGTTAA ATCCTCTATT GCTGTTTCTA 3840 

TTCTCTCTTA TAGTGACCAA CATCTTTTTA ATTTAGATCC AAATAACCAT GTCCTCX:TAG 3900 

AGTTTAGAGG CTAGAGGGAG CTGAGGGGAG GATCTTACTG AAAGCACCCT GGGGAGATTG 3960 

ATTGTCCTTA AACCTAAGOC CCACAAACTT GACAOCTGAT CAtSGTCTGGG AQCTACAAAA 4020 

TTTCATTTTT CTCCTCACTG CCCTTCTTCT GAGTGGCATT GGCCTGAATC AAGGAAAGCC 4080 

AGQCCTTGTG GGCCCCCTTC TTTCGGCTTT CTGCTAAAGC AACACCTCCA GCAGAGATTC 4140 

CCTTAAGTQA CTCC3USGTTT TCCaCCATCC TTCAGOGTGA ATTAATTTTT AATCSVGrrTG 4200 

CTTTCTCCaG AGAAATTTTA AAATAATAGA AGAAATAGAA ATTTTGAATG TATAAAAGAA 4260 

AAAGATCAAG TTGTC3VTTTT AOAACAQAGO GAACTTTGOa AGAAAGCAGC CCAAGTAGGT 4320 

TATTTGTAC3\. GTCAGAGGGC AACAGGAAGA TGCAGGCCTT CAA GSGCAAG OAGAGGCCAC 4380 

AAGGAATATG GGTGGGAGTA AAAGCAACAT OGTCTGCTTC ATACTTTTTC CTAGGCTTGG 4440 

CACTGCCTTT TCCTTTCTCA GGCCAATGGC AACTGCCATT TGAGTCCX3GT GAGGOATCAB 4500

CCAACCTCTT CTCTATGGCT CACCTTATTT GGAGTGAGAA ATCAAGGAGA CAGAGCT6AC 4560

TOCATQATGA QTCTGAAGGC ATTTGCAGOft TGAGCCTGAA CTGGTTOTGC AGAACAAACA 4620

AGGC31TTCAT GGGAATTGTT GTATTCCTTC TGCAGCCCTC CTT GTGGG CA CTAAOAAGGT 4680 

CTATGAATTA AATGCCTATC TAAAATTCTG ATTTATTCCT ACATTTTCTO TTTTCTAATT 4740 

IGACXCTAAA ATCTATGTGT TTTAGACTTA GACTTTTTAT TGCCCCCCCC CCCTTTTTTT 4800 

TIGAGACGQA GTCTCGCTCT GACGCACAGG CTGGAGTGCA GTGGCTCCGA TCTCTGCTCA 4860

CTGAAAGCTC CGCCTCCCGG GTTCATGCCA TTCTCCTGCC TCAGCCTCCT GAGTAGCTGG 4920 

GACTACAGGC GCCCACCACC ACGCCCGGCT AATTTITTQT ATTTTTAATA GAGACGG6GT 4980 

TTCACTGTCT TAGCCAGGAT GGTCTCX3ATC TCCTGRCCIC GTGATCOGCC TGCCTCGGOG 5040

TCCXaAAQTG CTGGGATTAC AGGCATGACC CACCGCTCCC GGCCTTCTTT TCOGTTXAAA 5100 

GTCGTCTTCT TTTAATGTAA TCATTTTGAA CATGTQTGAA AOTTQATCAT ACQAAnGGA 5160 

TCAATCTTGA AATACTCAAC CAAAAGACaG TCGAGAAGCC AGGGGGAtSAA AGAACTCAGG 5220 

GCACAAAATA TTGGTCTGAG AATOGAATTC TCTGTAAGCC TAGTTGCTGA AATTTCCTGC 5280 

TGXAACCAGA AGCCAG T Tl'T ATCTAACGGC TACTGAAACA CCCACTGTGT TTTGCTCACT 5340 

CCCaCTCACC GATCAAAACC TGCTACCTCC CCAAGACTTT ACTAGTCCCX3 AIAAACTTTC 5400 

TCAAAGAGCA ACCAGTATCA CTTCCCTOTT TATAAAACCT CTAACCATCT CTTTGTrCTT 5460 

TGAACATGCT GAAAACCACC TGGTCTGCM GTATGCCOGA ATTTQTAATT CTTTTCTCTC 5520 

AAATGAAAAT TTAATTTTAG GGATTCATTT CTATATTTTC ACRTATGTAG TA TTATTA TT 5580 

TCCTTATATG TGTAAGGTGA AATTTATGGT ATTTGAaTOT GCAAQAAAAT ATATTTTTAA 5640 

AGCTTTCATT TTTCCCCXAO TGAATGATTT AGAATTrrrr ATGTAAATAT ACAGAATGTT 5700 

TTTTCTTACT TTTATAAGGA AGCAGCTGTC TAAAATGCAG TGGGGTTTGT TTTGCAATGT 5760 

TTTAAACAGA GTTTTAGTAT TQCTATTAAA AGAAQTTACT TTGCTTTTAA AGAAACTTGG 5820 

CTGCTTAAAA TAAGCAAAAA TTGGATQCAT AAAGTAATAT TTACAGATGT GGGGAGATGT 5880 

AATAAAACAA TATTAACTTG GCTGCTTAAA ATAAGC3UVAA ATTGQATGCA TAAAGTAATA 5940

TTTACAGATO TGGGGAGATG TAATAAAAC3V ATATTAACTT GGTTTCTTGT TTTTGCTGTA 6000 

TTTAGAGATT AAATAATTCT AAGAIGATCA CTTTGCAAAA TTAIGCTTAT GGCTGGCATG 6060 

GAAATAGAAA TACTCAATTA TQTCTTTOTT GTATTAATGG GGAATATTTT GGACAATGTT 6120 

TCATTATCRA ATTGTOOACA TCATTAATAT ATATTQTAAT GTTOQGAASA GATC31CTATT 6180 

TTGAAGCACA GCTTTACAGA TGAGTATCTA TOATACATAT QTATAATAAA TTTTOATCGG 6240 

GTATTAAAAG TATTAGAAGO TQGTTATAAT TQCftOAGTAT TCCATOAATA QTACACTGAC 6300 

ACAGGGGTTT TACTTTGftGG ACCAOTGTAG TCAAGGGAAA ACAT6AGTTA AAAAGAAAAG 6360 

CAGGCAATAT TGCAGTCTTG ATTCTGCCAC TTACAGGATA GATAATGCCT GAACTTTAAT 6420 

GACAAGATGA TCCAACCATA AAGGTGCTCT GTGCrrCACA GTGAATCTTT TCCXTCATGCA 6480 

GGAGTGTGCT CCCCTACAAA CX3TTAAGACT GATCATTTCA AAAATCTATT AGCTATATCR 6540 

AAAGCCTTAC ATTTTAATAT AGGTTQAACC AAAATTTCAA TTCCAGTAAC TTCTATTGTA 6600 

ACCATTATTT •riv m 'lAi'QT CTTCAAGAAT GTTCATTGGA TTTTTCTTTG TAATAGTAAA 6660 

ATACCQOKTA CATTTCACGT GTOCTTCAST ATTGATTTGO TTGAATATTG GQTCATAATG 6720 

GTTOVGAAGC ATGCAGACrTA GAGCCAGAAT GCTTOOATAT QA&TCCTGGA TCTGTCACTT 6780 

ACTTCTGTGT GACCTTTGAA AGGCTACTTA TTTCCTCTCT TAGCTTTCTC ATTAAAATCA 6840 

ATQAAC3UVTG CCAGCXTTCSVT GGGGTTGTTG AATGATTAAA TTAGTTAATA TACCTAAAGT 6900 

ACATAGAACA CTGCCTGCAC ATAGTAAAAG AATTATAAGT aTGAQGTAGT TCB TAAA ATT 6960 

ATGTAGTTGG ATATACTACC GAACAATATC TAATCTCTTT TTAGGGAAAT AAAGTTTGT6 7020 
CATATATATA ATCXXX3AAAC ATG 

Seq ID NO: 34 Protein sequence:
Protein Accession #: NP_077741.1

1 11 21 31 41 51 

MAAAGPRRSV RGAVCLHLLL TLVIFSHDGE ACKKVILMVP SKLEADKIIQ RVIII.EECFRS 60 

ADLIRSSDFD FRVLNDGSVY TARAVALSDK KRSPTIHIia) KRKOXQKEVT VLI^HQKKVS 120 

KIRHTRETVIi RRAKRRWAPI PCSMQENSLG PPPLPLQQVE SDAAQIIYTVF YSISGRGVDK 180 

EPIiHIiFYIER DTGNIiFCTRP VDREBYDVFD IiIAyASTADG YSADIiPLPLP IRVEDEKDNH 240 

FVFTBAIYHF BVLBSSRPGT TVGWCATDR DBPDTMHTRli KYSIMJQTPR SPGI.FSVHPS 300 

TGVZTTVSHY IBREWDKYS LIMKVQDMDG QFFGLIOTST CIITVTDSND NAPTFRQNAY 360 

EAPVKEHAFK VEILRIPIED KDLIKTAITOR VNFTILKGNE NGHFKISTDK BTNEGVLSW 420 

KPUIYEENRQ VNLEIGVNKE APFARDIPRV TALNRALVTV HVRDLDESPE CTPAAQyVRI 480 

KENIAVGSKI NGYKAYDPEM BNGNGLRYKK IflDPKGWITI DEISGSIITS KILDREVBTP 540 

KNELVNITVIj AIDKDDRSCr GTIAVNIEDV NDKPPEIMJE YWICKPKM6 YTDIIAVDPD 600 

BPVHGAPFYF SLPNTSPEIS RLWSIiTKVND TAARLSYQKH AGFQBVTIPI TVKDRAGQAA 660 

TKLLRVNLCE CTHPTQCRAT SRSTGVIMK WAIUVIUiGI AMiFSVUiTIi VCGVFGATKG 720 

KRPPEDLAQQ NLIISNTEAP GDDRVCSANQ FMTQTTNNSS QGFCGTMGSG ^aCNGGQBTIE 780 
««B3GGMQTi:.E SCRSAGHHHT LDSCRGGHTE VDNCRrTVSB HH5FTQFRLG EESIRimTG 

Seq ID NO: 35 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 146-1273
1 11 21 31 41 51

I I I I I I 

GGGAGTCGGC GTGGCGGTGC TGCCCAGGTG ACSCCACXGCT GCTTCTGCCC AGACACGGTC 60 

GCCTCCACAT CCAGGTCTTT GTGCTCCTCQ CTTGCCTGTT CCTTTTCCAC GCATTTTCCA 120 

GGATAACTGT GACTCCAGGC CXS3C3UI.TGGA TCCCCTGCAA CTAGCAAATT CaSGCTTTTGC 180 

CGTTOATCTG TTCRAACAAC TATGTGAAAA GGAGCCACTG GGCAATGTCC TCTTCTCTCC 240 

AATCTGTCrC TCCACCTCTC TOTCACTTQC TCAAGTGC3GT GCTAAAGGTG ACACTGCAAA 300 

TGAAATTGGA CAGGrTCTTTC ATTTTGAAAA TGTCAAAGAT ATACCCTTTG GATTTCAAAC 360 

AGTAACATOG GATGTAAACA AACTTAGTTC CTTTTACTCA CTGAAACTAA TCAAGCGGCT 420 

CTACX3TAGAC AAATCTCTGA ATCTTTCTAC AQAGTTCSV^TC AGCTCTACGA AGAGACCCTA 480 

TGCAAAGGAA TTGGAAACTG TTGACTTCAA AGATAAATTG GAAGAAACGA AAGGTCAGAT 540 

CAACAACTCA ATTAAGGATC TCACAGATGG CCACTTTGAG AACATTTTAG CTGACAACAG 600 

TGTGAAOGAC CAGACCAAAA TCCTTGTGGT TAATGCTGCC TACTTTGTTG GCAAGTGGAT 660 

OAAGAAATTT CCTGAATCAG AAACAAAAGA ATGTCXTTTC AGACTCAACA AGACAGACAC 720 

CAAACX3«3Ta CAQATGATGA ACATGGAGGC GAOGTTCTGT ATGGGAAACA TTGACAGTAT 780 

CAATTGTAAlG ATCaiTAGAGC TTCXTTTTTCA AAATAAGCAT CTCAGCATGT TCATCCTACT 840 

ACCCAAGGAT GTGGAGGATG AGTCCACAGQ CTTGGAGAAG ATTGAAAAAC AACTCAACTC 900 

AGAGTCACTG TCACAGTGGA CTAATCCOtf; CACCATGGCC AATGCCAAGG TCAAACTCTC 960 

CATTCCAAAA TTTAAGCTTCG AAAAGATGAT TGATCCXXAG GCTTGTCreG AAAATCTASG 1020 

GCTGAAACaT ATCTTCACjTG AAGACACATC TGATTTCTCT GGAATCTCaKJ AGACCAACXSO 1080 

AGTGGCCCTA TCAAATGTTA TCCACAAAGT GTGCTTAGAA ATAACTGAAO ATGQTGGGGA 1140 

TTCXa.TAGAG GTGCCAGGAG CAOSGATCCT GCAGCACAAG GATGAATTGA ATGCTGACCA 1200 

TCCCTTTATT TACATCATCA GGCACAACAA AACTCGAAAC ATCATTTTCT TTGGCAAATT 1260 

CTGTTCTCCT TAAGTGGCAT AGCXICATGTT AAGTCCTCCC TGACTTTTCT GTQQATGCCQ 1320 

ATTTCTGTAA ACTCTGCATC CSVGAGATTCA TTTTCTAGAT ACAATAAATT GCTAATGTTG 1380 

CTGGATCAGG AAGCCS3CCAQ TACTTGTCAT ATGTAGCCTT CACSWaGATA GACCTTTTTT 1440 

TTTTTCCSyVT TCTATCTTTT GTrTCCTTrT TTCCCRTAAG ACAATGACAX ACGCTTTTAA 1500 

TGAAAAGGAA TCACGTTAGA GGAAAAATAT TTATTCRTTA TTXGTCAAAT TCTCOGGGGT 1560 

AGTTGGCAGA AATACAGTCT TCCACAAAGA AAATTCCTAT AAGOAAGATT TGOAAGCTCT 1620 

TCTTCCX:AGC ACTATGCTTT CCTTCTTTGG GATASAGAAT GTTCCAGACA TTCTGGCTTC 1680

CCTGAAAGAC TGAAQAAAGT GTAGTGCATG GGACCCACX3A AACTGCCCTG GCTCCAGTGA 1740 

AACTTGGGCA CATGCTCAGG CTACTATAGG TCCAGAAGTC CTTATGTTAA GCCCTQGCAO 1800 

GCAGGTGTTT ATTAAAATTC TQAATTTTGG GGATTTTC31A AAGATAATAT TTTACATACA 1860 

CTGTATGTTA TAGAACTTCA TGGATCRGAT CTGGGGCAaC AACCTATAAA TCAACACCTT 1920 

AATATQCTGC AACAAAATGT AGAATATTCA GACAAAATGG ATACATAAAG ACTAAGTAGC 1980 

CCATAAOGOQ TCAAAATTTG CTGCCAAATG CGTA1GCCAC CAACTTACAA AAACACTTCG 2040 

TTCGCAOAlQC TTTrCAOATT GTOGAATGTT GaATAAQaAA TTATAOACCT CIAOTAGCTa 2100 

AAATQCAAGA CXXXAAGAGG AAGTTCRQAT CTTAATATAA ATTCACTTTC ATTTTTGATA 2160 

OCTQTCCCAT CTGGTCATQT GQTTGGCRCT AGACTGGTG6 CAGGGGCTTC TAGCTGACTC 2220 

GCACAGGGAT TCTCACAATA GCCGATATCA GAATTTGTGT TGAAGGAACT TGTCTCTTCA 2280 

TCT7«,TATGA TAGCGGGAAA AGGAGAGGAA ACTACTGCCT TTA6AAAATA TAAGTAAAGT 2340 

GATTAAAGTG CTCACQTTAC CTTGACACAT AGTTTTTCAG TCTATGGGTT TAGTTACTTT 2400 

AGATGGCAAG CATGTAACTT ATATTAATAG TAATTTGTAA AGTTGGGTGG ATAAGCTATC 2460 

CCTGTTCCGG GTTCATGGAT TACTTCTCTA TAAAAAATAT ATATTTACCA AAAAATTTTG 2520 

TGACATTCCT TCTCCCATCT CTTCCTTOAC ATGCATTGTA AATAGGTTCT TCTTGTTCTG 2580 
AGATTCAATA TTQAATTTCT CCTATGCTAT TGACAATAAA ATATTATT6A ACTACC 

Seq ID NO: 36 Protein sequence:
Protein Accession #: NP_002630.1

1 11 21 31 41 51 

I I I I I I 

^mALQI<ANSA PAVDLFKQLC EKEPI.(aiVU> SPICLSTSLS LAOVGAKGOT ANBIGQVLHF 60 

EHVKDIFFGF QTVTSDVNKb SSFYSUOiIK RLYVDKSUO. STBPISSTKR PYAKEIjETVD 120 

FKDKLEBTXG OINHSIICDLT DGHFENIIiAD NSVHDQTKXIi WNAAYFVGK WMKKPPESBT 180 

KECPFRLNXT DTRFVQMMNM EATFCMQNZD SINCKIIELP FQNKHLSMPI LLPKDVEDES 240 

TGLEKIEKbL NSBSLSQWTII FSTHASAKVK I.SIPKFKVEK MIOFKACLEN LGLKHIFSED 300 

TSDFSGMSBT KGVALSNVIH KVCLBITBDG GDSIEVFQAR ILQHXDELHA DHPFIYIIRH 360 
NKTHNIIFFG KFCSP 



Seq ID NO: 37 DNA sequence 

Nucleic Acid Accession #: NM_016583

Coding sequence: 72-842 

1 11 21 31 41 51 

I i I I I I 

GGAGTGGGGG AGAGAGAGGA GACCAGGACA GCTGCTQAGA CCTCTKAGAA GTCCAGATAC 60 

TAAGAGCAAA GATOTTTCAA ACTGGGGGCC TCATTGTCTT CTACGGGCTG TTAGCCCAGA 120 

CCATGGOXa GTTTGGAGGC CTGCCCGTGC CCCTSQAOa GACCCTOCXX: TTSAATSTQA 180

ATCXM.GCCCT GCCCTTGAGT CCCACAGGTC TTGCAGOAAG CTTOACAAAT GCCCTCAQCA 240 

ATGGCCTGCT GTCTGGGGGC CTGTTGGGCA TTCTGGAAAA CCTTCJCGCTC CTGGACATCC 300 

TGAAGCCTGG AGGRGGTACT TCTGGTGGCC TCCTTGGGGG ACTGCTTGGA AAAOTGACGT 360 

CRGTGATTCC TGGCCTGAAC AACATCATTG ACATAAAGGT CACTQACCCC CAGCTGCTGG 420 

AACTTGGCCT TQTGCAGAGC CCTGATGGCC ACOGTCTCTA TGTCACCATC CCTCTCGGCA 480 

TAAAGCTCCA AGTGAATAOG CXXCTGGTOG GTGCAAGTCT GTTGAGGCTG GCTGTGAAGC 540 

TGGACATCAC TGCAGAAATC TTAGCIOIOA GAGATAAGCA GQAGAGGATC CACCTGGTCC 600 

TTGGTOACTa CAOCX»TTCC CCTGGAAGOC TGCAAATTTC TCTGCTTOAT GGACTTGGCC 660 

CCCTCCCCAT TOVAGGTCTT CTG6ACAGCC TCACM3QGAT CTTGAATAAA QTCCTGCXrrG 720 

AGTTGGTTCA GG6CAACGTG TGCCCTCTGG TCAATGAGGT TCT CAGftG GC TTQGAC3VTCA 780 

CCCTGGTGCA TGACATTGTT AACATGCTGA TCCACGGACT ACASCTTOTC ATCAAGQTCT 840 

AAGCCTTCCA GGAAGGGGCr GGCCTCTGCT GAGCTGCTTC CCAGXGCTCA CAGATGGCTG 900 

GCCCATQTGC TGQAAGATGA CACAGTTGCC TTCTCTCXXSA GGAACCIGCC CXXTTCTCCTT 960 

TCCCSVCCAGG CGTGTGTAAC ATCCCATGTG CCTCACCTAA TAAAATGGCT CTTCTTCTGC 1020 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAA 
Seq ID NO: 38 Protein sequence:
Protein Accession #: NP_057667






I I I I I I 

MFQTGGIiIVP YGIjIiAQTMAQ FGGLFVPtOQ TLPLNVNPAL PIiSPTGLAGS LTNALSNGLL 
SGGLLGILEN LPUiOIIiKPG GGTSGGIiLGG IiLGKVTSVIP GLHNirDIKV TDPQLLELGIi 
VQSFDGHRLY VTIPLGIKLQ VNTPLVGASL LRIAVKU)IT AEIIAVRDKQ BRIHLVIiGDC 
THSPGSIOIS LIiDGLGPIiPI QGLLDSLTGI LHKVI.PEI.VQ GNVCPLVNEV LRGIJ3ITLVH 
DrVNMLIHGL QPVIKV 



Seq ID NO: 39 DNA sequence
Nucleic Acid Accession #: NK
Coding sequence: 115-2223






CTCRGGGCAG 
TCCTGGAACT 
TCTCCCTCGG 
TCACTTCTAA 
TTCAATGTOG 
TTTGGCTACA 



AGGGAGGAAG 
CAAGCTCTTC 
CCCCTCCCCA 
CCTTCTGGAA 



I 

GACAGC3U3AC 
TCCACAOAGQ 
CAGATGOTGC 
CCCGCCCACC 
GGAGGTQCTT 



OTGGCCTTCR 
CACSAOCCTCC 
TTCAATGTCR 



AGTCAQATCT 
AOCCCTCCAT 
CCTOTGAACC 



TACCCX»GGG 
CCAGAACATC 
TGTGAATGAA 
CTCCAiGCAAC 



ACTGCCAAGC 
CTACTTOTCC 
GTGGATGGCA 
CCCGCATACA 
ATCCAGAATG 
GAAGCAACTG 
AACTCCAAAC 



TCACTATTGA ATCEACGCC3G 
ACAATCTGCC CCAGCATCTT 
ACCGTCAAAT TATAGGATAT 
GTGGTCGAGA GATAATATAC 
ACACAGGATT CTACACCCTA 
GCCAOTTCCG GGTATACCCG 



TCCOCTCTAA 
TCTAACCCAC 
GAGCTCTTTA 
AACTCAGACA 

cxx»AACXxrr 



CAGAGCCTCC CGGTCAGTCC CAGGCTGCAG 
CAAGAAATGA 
GTGATTCAGT 
ACACATCTTA 
CTGCACAGTA 
TCCXrCAACAT 
CTGGCXrrCAA 
TCATCACCAG 
AACCTGAGAT 

CTCCCGGTCA 
GTCACAAGGA 
CACA6CGACC 
TCATACACXrr 
CCACCTGCAC 
TTTATCTCCA 
GCCAGTGGCC 
CCCTCCATCT 
TGTGAAOCIG 
GTCAGTCCCA 
AGAAATGACG 
GACCCAGTCA 
TOGTCTTACC 
CCGCAGTATT 
GCCAAAATCA 



TGAGACTCAG GACGCAACCT ACCTGTGGTG 



CTGTCCAATG 
TACAAATGTG 
GTCCTCTATG 
GAAAATCTGA 
GTCAATGGQA 
AATAGTGGAT 



ATGATOTAQG ACCCTATGAG 
C3VGTCATCCT GAATGTCCTC 
ATTACCGTCC AGGGGTGAAC 
AGTATTCTTG GCTGATTGAT 
ACATCACTGA GAAGAACAGC 
ACAGCAGGAC TACAGTCAAG 
OCAGCAACAA CTCCAAACCC 



AACCCCGTGG 
ACCTAOCraX 
AATSACAACA 
TGTGGAATCC 
TATGGCCCAG 



CCTATACGTG 
TCACAGTCTA 
AGGATGAGGA 
GGTQGGTAAA 



GGTAAACAAT 
CCTOICTCIA 
CCCAGTGAGT 
CCCCACGATT 
CCACGCAGCC 
ATCCACCCAA 
CCAAGCCCAT 



TGCT6TAGCC 



GGGAACATCC 
GGACTCTATA 
ACAATCACAG 



CTCTCAGCTG 



TAAAGCRTTT 
AGACTCTGAC 
AAAXACAAAA 
TGAGGCAGGA 
ACTGCACTCC 
TCTGACCTGT 
AACTTTAATG 
TAATTAATTT 
TTCCCAGATT 
AAATATACTT 
AGACTTGG6A 
TCAATAAAAA 



TTTCGGGAGC 
CTTGGCGTAT 
03CCAAATAA 
ATTCCATAGT 
GGQCCACTGT 
GGTGTAGTTT 
GCAACAGCTA 
CAGAGATCGA 
ATGAGCTGGG 
GAATCGCTTG 
AGTCTGQCAA 
ACTCTTGAAT 
AACTAACTGA 
CATGGGACTA 
TCAGGAAACT 
TTGTGAACAA 
AACTATTCAT 
TCTGCTCTTT 



TGTATGTGGA 
CCTCTATGGG 
GAACCTCAAC 
CAATGGGATA 
TAACGGGACC 
CAAGAGCATC 
CGGCATCATG 
CTTCATTTCA 
CAGTCTAAAA 
GACCATCXTTA 
CTTGGTGGCG 
AACCCGGGAG 
CAGAGCAAGA 
ACAAGTTTCT 
CAGCTTCATQ 
AATGAACTAA 

AAATTGAGAC 
GAATATTTAT 
OTATAACAGA 



ATCC3«3AACT 



CCGCAGCAAC 
TATGCCTGTT 
ACAGTCTCTG 
ATTGQAGTGC 
GGAAGACTGA 



GCCAACATCG 
03CACCTGTA 
GTGGAGATTG 
CTCXATCTCA 



AGAAOGAATT 
ACGACCCCAC 
CCTGCC3\TGC 
AGCAACACAC 
CCTGCCAGGC 
TCTCTGOGGA 
AGGATGCTGT 
TAAATGGTCA 
TCACTCTATT 
CAGT6A6TGC 
CtaTCATTTC 
ACTCGGCCTC 
ACACACAAGT 
TTGTCTCTAA 
CATCTGGAAC 
TGGTTGGGGT 
CAGTTGTTTT 
ACCAAGGATA 
TGAAACCCCA 
GTCCCAGTTA 
CAGTQA6CCC AOATCGCACC 



AAGTGTTGAC 
CATTTCCCCC 
AGCCTCTAAC 
ACAAGAGCTC 
CAATAACTCA 
GCTGCCCAAG 
GGCCTTCACC 
GAGCCTCCCA 
CAATGTCACA 
AAACaSCAGT 
CCCCCCAGAC 
TAACCX»TCC 
TCTCTTTATC 
CTTGGCTACT 
TTCTCCTGGT 
TGCTCTGATA 
GCTTCTTCCT 
TTTACAGAAA 
TCTCTACTAA 



AAACTOIOCA CCAASATCAA GCAGAGAAAA 
TGAGGATTGC TGATTCITTA AATGTCTTGT 
TAAGCTATCC ACTCTTACAG CAATTTORTA 
ATTTACATTT TCKOCTATG TQOTCX3CTCC 
ATTGTATGOT AATATAGTTA TTGCACAAOT 



1200 
1260 
1320 
1380 
1440 
1500 

1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 

2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 



31 



41 






MESPSAPPHR 
HtFGYSWYKG 
TLHVIKSDIjV 
NNQSIjPVSPR 
TISPLNTSYR 
AHNSDTGUJH 
QSLFVSPRLQ 
SPSKTYVUPG 



KCIPHQRLUi 
ERVDGNRQII 
NEEATGQFRV 



TASLLTFUHP 
GYVIGTQQAX 
YPEIjPKPSIS 



TTVTTITWA 
LSNDHRTIiTIj 
VIILSI.SC31AA 



EPPKPFITSH 
LSVTRHDVGP 
SNPPAQYSWL 



I I 
PTTAKLTIES TPPNVAEGKE VIAr,VHMLPQ 
PGPAYSGRBI lYFHASLLIQ KIIQHOTGFY 
SNNSKPVEDK DAVATTCEPE TODATYLHWV 
ASYKCBTQMS VSARRSDSVI LNVLYCPDAP 
HFVNOrFQQS TQELFIPNIT VHSSGSnrTCQ 



I^PVSPHIiQIjS nonrtltlfn vtrndaeayv 

PDSSYLSGAN tULSCHSASN PSPQYSWRIH 
ATQSNHSIVK SITVSASGTS PGLSAGATVG 



VECGIQNELS VDHSDPVIIM VLYGPDDPTI 
IDGmOQHTQ EIiFISNITEK NSGIiYTCQAn 
KPVEDKDAVA FTCEPEAQNT TYLWKVNGQS 
CGIQNSVSAN RSDPVTU3VI. YGPOTPIISP 

GIPQQHTQVL PIAKITPNIW G 

IMIGVLVGVA LI 






Seq ID NO: 41 DNA sequence



1 11 21 31 , 41 51 

I I I I 1 I 

AATCXXQACA. ATGGCOAAAG ACAACTCAAC TGTTCGTTGC TTCCAGGGCC TQCT GATTTT 
TGGAAATGTG ATTATTGGTT GTTGCGGCAT TGCCCTGACT GCGGAGTGCA TCTTCTTTQT 

10 ATCTGACCAA CACAGCCTCT ACCCACTGCT TGAAGCCACC GACAACGATG ACATCTATGG 
GGCTGCCTGG ATCGGCATAT TTGTGGGC3VT CTaCCTCTTC TGCCTQTCTG TTCTAOGCAT 
TGTAGGCATC ATCAAGTCCA GCAGGAAAAT TCTTCrGGCQ TATTTCATTC T BATGTTT AT 
AQTATATGCC TTTQAAGTGG CATCTTGTAT CACSVGCftGCA ACACAACGAG ACTTTTTCAC 
ACCCAACCTC TTCCTGAAGC AGATGCTAGA GAGGTACCAA AACAACAGCC CTCCAAACAA 

15 TSATGACCAG TG6AAAAACA ATGGAGTCAC CAAAACCTGG GACAGGCTCA T6CTCCAGGA 
CSkATraCTGT GOGGTAAATG GTCCATCAGA CTGGCAAAAA TACACATCTG CCTTCaSGAC 
TGASAATAAT GATGCTGACT ATCCCTGGCC TGGTCAATGC TGTGTTATGA ACAATCTTAA 
AGAACCTCTC AACCTGOAQG CTTGTAAACT AGGCGTGCCT GQTtTTTATC ACAA TCAGGG 
CTGCTATGAA CTGATCTCTG OTCCAATQAA CCQACAOBCC TGGGGGGTTG CCTGGTTTGG 

20 ATTTGCCRTT CTCTGCTGOA CTTTTTGGGT TCTCCTGGGT ACCATGTTCT ACTGOAGCAO 
AATTGAATAT TAAGAA 






Seq ID NO: 42 Protein sequence:
Protein Accession #: NP_008883.1



MAKDNSTVRC FQGLLIFGNV IIGCCGIAiT AECIFFVSDQ HSLYPUiEAT DNDOIYGAAW 
IGIFVGICLP CLSVLGIVGI MKSSSKILIA YFILMFIVYA FEVASCITAA TQRDPPTHn. 
30 FIiKQMLBRYQ NNSPPNNDDQ WKNNGVTICIW DRLMLQDNCX: OVNOPSDNQK YTSAFRTQIN 
DADYPWPRQC CVMNNIiKEPL HLEACKIiGVP GPYHNQGCYE IiISGPKNSBA MaVAWFGPAI 
LCWTFWVliLG TMFYWSRIEY 



Seq ID NO: 43 DNA sequence

Nucleic Acid Accession #: Eos sequence 
Coding sequence: 83-2605 

1 11 21 31 41 51 

40 I I I I I I 

GCCXSGACAGA TCTGCGCGTA TCCTGGAGCC GGCCCAGTTG TGflACTAGGA GAGCTTTGPG 60 

ACCTCTGTCC CAAGCAAGAG AGATGAATGG AGAGTATAGA GGCAGAGGAT TTGGACGAGG 120 

AAGATTTCAA AGCTGGAAAA GGGGAAGAGG TGGTGGGAAC TTCTCAGGAA AATGGAGAGA 180 

AAGAGAACAC AGACCTGATC TGAGTAAAAC CACAGGAAAA CGTACTTCTa AACAAACCCX: 240 

45 ACAGTTTTTG CTTTCAACAA AGACCCCACA GTCAATGCAG TCAACATTGG ATCGATTCAT 300 

ACCATATAAA GOCTOaAAaC TrTATTTCTC TQAAaTTTAC AGCXHVTAGCT CTCCTTTGAT 360 

TGAGAAGATT C3UkOCATTTG AAAAATTTTT CACAAGGCAT ATTGATTTGT ATGACMGGA 420 

TGAAATRBAA AGAAAGGQAA GTATTTTGGT A<3ATTTTAAA GAACTGACAG AAGGTGGTGA 480 

AGTAACTAAC TTGATACXaG ATATAGCSIAC TGAACTAAGA GATGCACCTG AGAftAACCTT 540 

50 GGCTTGCATG GGTTTGGCAA TACATCAGGT GTTAACTAAG GACCTTGAAA GGC»TaCAGC GOO 

TGAGTTACAA GCCCAGGAAG GATTGTCTAA TGATGGAGAA ACAATGGTAA AlGtGCCACA 660 

TATTCATGCA AGGGTGTACA ACTATGAGCC TTTGACACAG CTCAAGAATG TCAGAOCAAA 720 

TTACTATGGA AAATACATTG CTCTAAGAGG GAOMSTGGTT CGTGTCAGTA ATATA AABCiC 780 

TCTTTGCACC AAGATGGCTT TTCTTTGTGC TGCATGTGGA GAAATTCAGA GCTTTCCTCT 840 

55 TCCAOATGGA AAATACAGTC TTCCCAC3UUV GTOTCXTOTQ CCW3TGTGTC GAGGCAGGTC 900 

ATTTACTGCT CICCGCftBCT CTCCTCTCRC AGTTACGATG GACTGGCAGT CAATCAAAAT 960 

CCAGOAATTQ ATOTCTOATO ATCAORGAOA AGCAGGTCGG ATTCCACGAA C3VATAGAATG 1020 

TOAGCTTQTT CATOATCTTG TGQATAGCTQ TGTCCCGGGA GACACAGTGA CTATTACTGa 1080 

AATTGTCSUiiA GTCTCAAATG CGGAAGAAQG TTCTCGAAAT AAGAATGACA AGTGTATGTT 1140 

60 CCTTTTGTAT ATTGAAGCAA ATTCTATTAG TAATAGCAAA GGACAGAAAA CAAAGAGTTC 1200 

TGAGGATGGG TGTAAGCATG GAATGTTGAT GGAGTTCTCA CTTAAAGACC TTTATGCCAT 1260 

CCAAGAGATT CAAGCTGAAG AAAACCTGTT TAAACTCATT GTCAACTCGC TTTGCCCTOT 1320 

CATfTTTGGT CATGAACTTO TTAAAGCAGG TTTGaCATTA GCACTCTTTG GAGGAAQCCR 1380 

GAAATAC6CA QATGACAAAA ACAGAATTCX; AATTOSGGQA GACCXXX^CA TCCTTGTTQT 1440 

65 TCOAOATCCA GGCX:TAGGAA AAAGTC3«AT GCTACSGGCA GaSTGCAATG TT GCCC CAOO 1500 

TGQCOTSTAT OrTTGTQGTA ACRCCACGAC CfiCCICIGOT CTGACX3GTAA CTCTTTCAAA 1560 

AGATAGTTCC TCTGOAGATT TTGCTTTGGA AGCTGGTGCC CTGGTACTTG GTGATCAAGG 1620 

TA1?rTOTGGA ATCQATGAAT TTQATAAGAT GGGGAATCAA CATCRAGCCT TGTrGGAAGC 1680 

iK^mi^T^jv /— iv-*'nTv-r^R& fuzT'mnrrGTa 14TTT13TAGCC TTCCTGCAAO 1740 



TATtTOTGGA ATOOATGAAT TTQATAAGAT OWaOAAl-UW aviiii»j»ww»»v. 
CATGGAGCAG CAAAGTATTA GTCTTGCTAA GGCTGGTGTG OTTTOTAGCC TTOCTGCSkAO 
AACTXCCATT ATTGCTCCTG CAAATCX^VGT TGGAGQACAT TACAATAAAG CCAAAACAOT 



1800 



70 AACTXCCATT ATTGCTGCTG CAAATCCAGT TGGAGQACAT TACAATAAAG CCAAAACflOT i»uu 

TTCTGAGAAT TTAAAAATGG GGAGTGCACT ACTATCCAGA TTTGATTTGG TCTTTATCCT 1860 

GTTAGATACT CCAAATGAGC ATCATGATCA CTTACTCTCT GAACATGTGA TTGCAATAAG 1920 

AGCTGGAAAG CAGAGAACCA TTAGCAGTGC CACAGXAGCT CGTATGAATA GTCAAGATTC 1980 

AAATACTTCC GTACTTGAAO TAGTTTCTGA GAAGCCATTA TCAGAAAGAC TAAAGGTGGT 2040 

75 TCCTGQAGAA ACAATAGATC CCATTCCCCA CCAGCTATTG AGAAAGTACA TTGGCTATGC 2100 

TOGGCAGTAT GTGTACCCAA GGCTATCCAC AGAAGCTGCT OGAGTTCTTC AAGATTTTTA 2160 

CCTTGAGCTC OGGAAACAGA GCCAGAGGTT AAATAGCTCA CCAATCACTA CCAGGCAGCT 2220 

GGAATCTTTG ATTCGTCTCA CAGAGGCACG AGCAAfieTTG OAATTGAGAG AGGAAGCAAC 22B0 

CAAAQAAGAC GCTQAGGATA TAQTGGAAAT TATGAAATAT AOCIVIXSCTAa GAACTTACTC 2340 

80 TGATGAATTT GGGAACCTAG ATTTTGAGCG ATCOCAGCAT GOTTCIOQAA TGAaCAACAQ 2400 

GTCAACAGCG AAAAGATTTA TTTCTGCTCT CAACAAOGTT GCTQAAAGAA CTTATAATAA 2460 

TATATTTCAA TTTCATCAAC TTCGGCAGAT TGCCAAAGAA CTAAACATTC AGGTTGCTGA 2520 

TTTTOAAARI TTTMTGGAT CACTAAATGA CCAGGGTTAC CTCTTGAAAA AAGGCCCAAA 2580 

AOTTTACCAG CTTCRAACTA TGTAAAAGGA CTTCACCARG TTAGGGCCTC CTGGGTTTAT 2640 

85 TGCSiGATTAA AGCCATCTCA GTGAAGATAT GCGTGCACGC ACACaCAGAC AQACACACAC 2700 

ACACACACAC ACACACACAC ACACACACAC ACACACAGTC AAATACTOTT CTCTGAAAAA 2760 

TOATCTCCCA AAAGTATTAT AATAGGAAAA AAGCATTAAA TATAATAAAC TAATTTAAOA 2820 



AGTGATAAAG TCTCCAGATG
GGTGAGAGGA TTCCTTGAGG 
CATTTCTTAA AAAAAAAAAA 
TAOTCTCAGC TACTTGTGAG 
TAtaGTOAGC CACAATCACA 
GACTCAAAAA AATAAAAAAA 
CCAAAGGGCr AAAAGTAAAT 
TTATATGTAT GAATATTTCA 
AAACCAATGA ATATATTACA 
ATTTGAATTT CATAAAATTT 
GCTATTTAAT AATAGGTCTC 
AATAGAAACA GACTGATTAA 
AATTATTAGA AGGCAGGTGA 
GCCTTAGAAT TGGACTAAGQ 
AGAAAQT6CT GCTGCCTCCC 
GAATGCCCCC ACCCGCACCG 
ATTGCTGAAT TCAAAAAAGA 
TATGCKCCTT TCATAGGCTG 
AATAAGACCA GAATTTCTCA 
AAACTATCAA TCATGTAT7UV 
GTGAACXa\TT GTTGGAGAAT 
AATGTAAATA AAAAGAACTG 
QATGTGGAGA CTATTGCCAT 
GOAATCAAAA GGGGCCAGGT 
GAGGCAGGAG GATCACTTGA 
T&TCTCTACA AAAAATAGAT 
GTGGAGGCTQ AAGTAGGAAA 
TTATACCACT GCACTCCAGC 



CCAGGGTTCS3 
AAAAAATTTA 
GCTGAGGCAG 
CCAATCACTG 
ATTGTAGTGG 
TACTTATAAA 
TAGTTTTGCA 
TATTCTGTGT 
TCCCATGTCA 
ATTTATTCXa^ 
GCAGGAGAAG 
ACCAGGAGGG 
AAGAAGCTGC 
TGCCCCACCT 
GAACAGCAAC 
AGTTGCaVTAC 
CTAGGGAGTT 



CACTGTAATC 
AGACCAACCT 
AACTTAGCTG 
GAGGATTCTT 
CACTCCAGCC 
TAGCCATGTG 
TTTTTTATAG 
TATCAGATGT 
TCCAATAAAA 



ACAGTGACTC 
TGGGCAACAT 
GGTATGGTGG 
TGAGCCCAGG 
TGGGCAATAA 
TTAATTGTTA 
TTGTATTTTT 
AGGCATACAG 



AG6AGGCTGA 
AGCAAGRCCC 
CACATGCCTA 
AGTTTGAGGT 
AGTAACTCTT 
AATAAATTCT 
GACCTGCCTT 
ACAAATACAT 



CAGGCTGTAG 
TTTTTTGAAA 
TAAGCTTCCA 
TGACACTCOV 
TTGCCACTTC 
AAAAGGATTC 
AAAGACATCT 
TTCCTGGTTC 



AXACrrOAGT 
■TTTGTAGTCT 
GftATTTTGTT 
GCAGCAATTT 
CTGCCACACA 
TGCAGCAGGA 



TOCTTGAAAC 



GTAAAACX»T 
GGGCACTGGA 
ATAGGTAGAA 



TGC3VTGAGAT 



ATCCAACAAA 
CTACTAAAAT 
GCAGTGTATA 
AGACCACAAT 
GCAGTGGCTC 
AGCCAGTTTT 
TAGCTGGGCA 
TCACTTGAGC 
CTGGGCAAGA 



CACTTTQTAA 
ACGGCTTCCC 
TCAGATGTTT 
GTAAATTTTT 
ACATCTATAA 
QAGACCAGCC 
CGGTGGTGCA 
CCX3AGAGTTT 



TACTTTCAGG 
AT6TTACAGG 
CATACAAGAA 
GC3UVACGAAG 
AACTATAOGA 



TCCCAGAGCT 
TATQCAACAC 
TGOCTATTGT 



GCCTCCCTAA 
GGGTATGTTA 
TGGTGGGATC 
GTTGCCAGCC 
CTCAGGAAAT 
ATGAATGGAA 
CCAGAACTAA 
GGAAAAATCA 
TTGGGAGTTC 
ATTGAGACCC 
OCTACXTTACT 
TGAGCTATGA 



28B0 
2940 
3OO0 
3060 



34B0 
3S40 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 



4320 
4380 
4440 






Protein Accession #: CAB55276.2



I I 
MNGBYRGROF GRGRFQSWKR 
TFQSMQSTLD RFIPYKGWKL 
ILVDFKBI.TE GGBVTOUPD 
IiSNDGETMVN VPHIHARVYN YEPLTQIiKNV 



I£AAOGEIQS I 
QREAGRIPST I 
SISNSKGQKT K 



HREREHRPDL 
PDIEKIQAFE 
KTLACMGIAI 
RANYYCICriA 



SKTTGKRTSE QTPQFIiLSTK 
KFFTRHIDLY L 
BQVIiTKDLER E 
LRGTWRVSN IKPLCTKMAP 



. PTKCPVPVCR ratSFTALRSS Pt.TVTMDWQS IKIQELMSDD 
/ DSCVPGDTVT 



TTTTSGLTVT 
LAKAGWCSL 
HDHLLSBHVI 
IPHQLUIKYI 



GSQKYADDKN 
LSKDSSSGDP 
PARTS 1 1 AAA 
AIKAGXQSTI 
GYARQYVYPR 



SAUOIVAERT YNNIFQFHQL 



RIPIRGDPHI 
ALEAGAiVIiG 
NPVGGHVNKA 
SSATVARKHS 
LSTEAARVU] 
VEIMKSSMLG 
RQIAKEUIIQ 



ITGIVXVSHA 
YAIQQIQABE 
LWGDPGLGK 
DQGICGIDBF 
KTVSBNIjKMG 
QDSlirSVI.EV 



NLFKIiIVNSI. 
SQMLQAACNV 
DKMCaiQHQAI. 
SALLSRFDbV 



CMFLLYIESUJ 
CPVIFGHELV 
APRGVYVCGN 
LEAMBQQSIS 
PILLDTPNEK 
KWP6BTIDP 



Seq ID NO: 45 DNA sequence

Nucleic Acid Accession #: NM_005416.1

Coding sequence: 149..658



AOCAOATCCC 
CTGAAGACCA 
AAAGAGTGTG 
CCCACCACCT 
AATATTTGTT 
AAAGATTCCA 
GCCAGGCTGT 
CAAGGTCCCT 
ACCAGGCAGC 
CAAAGTTCCT 
GCCATGTCCT 
TGGTGCTlCAG 
TGTTTCTGTG 
AGTCTCTCTC 
CTGAAGAATC 
GGCTGCTCAG 
CTCATTAAAT 



AGAOaCTaAA 
GAAAAGCCRC 
TCCACGATCC 

CAGCTTCAAC AGCAGCAGGT 



GAGCCAGGTT 
ATCAAGGTCC 
GAGCAAGGAT 
TCAACGGTCA 
ACAAGCCXTTT 
TCTTAATTGT 
TTATTTGTAT 
CTGTAAGCCC 
GGTTCATCTG 
TGCTTTTAAT 



GTACCAAQGT 
CTGAQCCAGQ 
GTACCAAGGT 
CTGACCMGG 
ACACCAAAGT 
CTCCAGGCCC 

CTOTAGACCT 
CCTAAAAATA 
CTOAATTAAa 
AAGATTCSAA 



31 
I 

TTCTCTGCAC 
TGCTTAATTC 
GAGTTCTTAC 
GAAACAACCC 
CCACTCAAAG 
CCCTGA6CCA 
TT(jrACX»AQ 
CCCTGAGCCA 
CTTCATCAAQ 



CCCTGAGCAG 
AGGATTCTTC 
AGACCTTTAC 



CCACCAGATG 
TGTAATCAQC 
CGCACTATAA 



6GCTACACAA 
AAGACCAAGC 
CTGGACACCC 
ACATTOTCAC 



CAG6CTGTAC 
AGGTCCCTOA 
CAGGTGCCAT 
AGCTACCAGA 
AGAAGTAATT 
TCTTOCCATC 
GCX»AGCCAT 



C CIGCTCTTOC 960 



I I I I I 

MSSYQQKQTF TPPPQLQQQQ VKQPSQPPPQ EIFVPTTKEP 
VPEPQCTKVP EPQOTKVPEP GCTKVPEPGC TKVPEPGCTK 
I IKVPEQGYTK VPVPGYTKLP EPCPSTVTPG 



VPEPGYTKVP EPGSIKVPDQ 



1 11 
I I 

GCGTCGTGIO CAGGCGTCCC 
AAGGCTCGTT AOAATTCGCC 
TTGCAATTAA GCTTAGGGAA 
GAAAATGGAT TAGAGAAACT 
TTTGGGGAAA GTGCCCXX3AC 
AGTCGGCGTT GGCGGCAGCG 
TAAGGATAAC ATCCTGGAAA 
TTGGAGCTGC CCTGTGGAGT 
CTAAAAACTT TOTGAOAATT 






21 
I 

CGGGCTGTGQ 
CTAjGAGCTGT 
CCAGCAACAA 
TCTTCCCCGA 
CGCAGAGGCG 
GTGGCCTTCC 
TGACTTCTGT 
TA CASTTT AC 
TTCTTTTACT 



TTTAAGGGQA 
ACGACAGGGG 
TCATCTGGGC 



AAOATTCXriG 



GATGTGGGCT 
CCCAACTGCA 
CATGAACATA 
CTTATTACAA 



51 
I 

CTCATTGCCC 
TTAACTTTGC 
CX5TTCACCX3C 
CGGCCAGCGC 
TGCTCACXXST 
CCTAGAAGAG 
CACTCATGAC 
ATCICATTTA 






1 

TT< 

TTTTAGTAAA 
CTCCAAGTCA 
TCCTTACTCT 
CGGACTACCG 
CCCAAAGCGC 
ATTTTCGCGQ 
TTGCAAGCAA 
AGCCTTGGGC 
CX3ACGCT 



TGAGATTATG 



TCTCX3GAGCC 
TGAGCAGCTT 
TGGCCGCAGG 
TGAACGACCT 
AGTTAATTTG 
AATGAGGGAA 



AATAAGAAAA 
TTCATQAATO 
CTTGQGCTCA 
CACATCGCCC 



I 



GAACGTGTCT 



CTTAAATOGG 
TTGCTTTTGT 
CATGATACAG 
AGTTATCCAC 



ACTGTAACTC 
AAGTCATTTC 
GGCCACOGCT 
TCTGCGGTCG 
GGAAGAAGTT 
TGCTGGTTCC 
CTCTAGGGOG 
AGCCCGGGGA 



CTCACAAAGT 



CAGGATGTTA 
GCCGCCAACG 
GGGCACTTTC 
TCrCTAATCC 
CTAAGCTTAA 



CGCCTGCACA 540 



s CAT cluster 






r GCTGCTCGTT TGTCTCTCCT GTGCTCTTCr TCTTTCTTTC CCTCGCCGCT 
CTTCTCTGAT GGCGGGGGGC GGGAGAAGCT GACCGGTGAQ 
GG6TGTCACA AGCCGGTCGC OOQCTTTTTT GGQAOAACCC 
CCTGOAAOIG CATGACCATQ TTATTACTAT GGaCOGCCTC 
GTGTTTAAAA CTTTTTAOQO CAOCCCCAAA ATTTTTTTTT TTTTTTTTTT 
AAACTCTAAT ATTTATATTA AATACAAAGA TAC CCAAAC C CTTTATGCTT 
TCTGTGTCTT TTTTCTTTGA CA6CATCTCC ATTTTTTTTC TGCTGCTTCA 
CATGGGAATC CGTTTCATTA TTA1GGTAGC AATATGGAGT GCTGTATTCC 
GACACAGGAG AATCACTTGA ACTTGGGAGG CAGAGTTTGC AGTGAGCCGA 
GTGCACTCCA GCCTTGGCAG CGGAGCAAGA TTCTGTCACA GTTCCTGAAG 
GTCCTGCAGC CCCATCCTCG GTTCCATTGC GCTGCCAGGC AGGGTQCTGG 
GAOGrGaGGA GAGCTGGTCT ATATATCCGG GTGAAQCTCA GCTGTGQCAC ACCTTGOATG 
CCGGGTCTCT CCTGGCCCOG ©SQACCTAOT ATTTTTQCCA COAGTGTACA CCAAACAAAG 
GAGACAGCAT CATTTATGAG CCTGCAGCAT CCACCCTACT GCTGTATCCA GTTTCCATTG 
ACTQ 

Seq ID NO: 50 DNA sequence
Nucleic Acid Accession #: L05187
Coding sequence: 1991..2260



CCCAACCAAA 
TTCATTTAAA 
CTTTCTCTGA 
TCGCTGTAGC 
TAAAGAAACT 
GATTGAACCA 



TCAQAAAGGA 



GGAAATOGAT 
ATTTCTAGCT 
CCCCTCCCTT 



11 

1 

~ GCAGGTAGAA 
GGAAAA6GCC 
ATT AGCCC CT 
CAGGTTTTCC 
GTAAATTATT 



TCCACCTTCA 
TCCCACCTAT 
CAATGACAAQ 
CTCATGAAAC 
GGGTCTGAGG 
GTGCCAAAAA 



TATAATCAGT 



TCACACCAAA CCCAAC 



I I I I 

AAGGCTTTTG GGTTTTCAGG TGGGGGGCAG TCTAGCCTGA 
AGGGCAGATO TCTGOGIGGA GTGAAGQGAA AAAGTGATCC 
GAAAOTCCCT GAAGTAGGAG AAGGGTAAAG GTGTGGTTGQ 
CAOATTAGCA ACCAGTCAGG GGQAGGAAGO TGAGAGTGGG 
CTGAATOIGI GTAGTTTAAT GQAArTOQaA AAAAGATGGQ 300 
GGACTCTOAG ACAAGGGGTC 
CCAAGGCKQA CAAGQAGGGC 
TCATGTGTGC AAGAGT6CCC 
GACAGCAGGT GGCAAGGCTC 
CCTCCATQAA OCCTGCTGCT 
ATGAGGGTGG CAGTGAAAAT 
ATATCAGCTG GTGTTCATCA 
TCATOTGTGA CAGGTGAGGA 
AGGGCTCATT CATCITAXAA 



TGTCCCACAO 
AACAGGACTC 
CACCCCTCCC 
TAGGCCAGTG 
AATAAGCXSA 
ATGAAAACAG 



CCATTTCATT 
TCCTCTGCTC 
AACACGGGGA 
AGATGTCCCC 
TCAAGGCAAG 
ACATCATTTT 



ATATGTGTAA 
TATTTTAAGT 
CCTCAGTAGA 
AGTTCATAGC 
TGACAAGATA 



AACATAAAAC 
AGTAATTGGC 
AGGAGACCTC 
AGATGGGAAG 
GAGGCTTAGA 
GAGGAAAGTG 
GAAGCCAQCT 
GAGCCAAOAA 



ACT6GCAATT 
GCAGQTTAAT 
TAAATTACAG 
TAGTCATTGA 
AGAACTAGAA 
TTTATAGAAA 
GTATGCTAGQ 
CTAGCAGOAA 



TTTTCATTAT 
AATGGGAGAT 
AAAGGACCTT 
CTGGAGAAOA 
AGCACTCTCA 
TTTTAATTTA TTAOATOGAT 



CTAGTGTACT 
CCAGGGTTTC 
TCTGGATTTG 
ACTGGGAGTC 



GTAXACCAGG TAAaTCTCTG 
TABAAATTAG CTAAAG6CAA 
AGAGAATAGT GGAATATCTT 
AGAGATGGTT AGGGCTCCCA 
TTGTTCAAAT GCCCATGGGA 
GTAACACTGC AATTTCCCCC 
CTCTACTGAQ CATTTATTCC 



GOTAATACAT I 



r AAAIGAAATO CAAAGTAGAT 



TAGGGTGTCA 
AAAAGCATTT 
TGAATATAAA 
GTCTGATGCC 
TTAGTAGGGC 
GAGAACTCXA 
CTGAAAATTA 



AGTGATGIGA 
GGAAGGGACT 
GCCATCCTAT 
ATTTTCCAAA 
ATTTTTCCAG 
ATAAAATGGA 
TCCAAGCTTA 



GCTATGATGG 
GTGXAAGCAC 
AAGTCACAGG 
AGACCTAATA 
AACAGATATA 
GCAGAAGAAA 
TTTCATTTTT 



AGGGGTATTT 
AGACCAGAAO 
CTTTCTACAT 
TGCGGACCTC 
AGQTGCCTTG 
TTGCCTTTTA 
AAATGTAATG 



ATGTCCCTCA 
GGTAGGAAGG 
GCTCCTCCTC 



1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 

GGGAGATGAA AGGCrTTCTC TTCTAAAGOS TCCTGAAATA AAATCTGTTT GGCRTTGAAT 1920 

TTC5TATCCAT CTTTCTTTAA TTGAATCACT GTGTCAGCTT TCTGTCTCTA 6AAAAAAACA 1980 

CATTTGAAGC ATGAATTCTC AGCAGCAGAA GCAGCCTTGC ACCCCACCCC CTCAGCCTCA 2040 

GCXGCAOCAG GTGAAACAAC CITGCCAGCC TCCACCCX3«3 GAACX3VTGC31 TCCCCAAAAC 2100 

CARGQAGCCC TGCXKACCCA AGGTGCCTGA GCCCTGCCAC CCCAAAGTGC CTGAGCCCTG 2160 

CCAlGCCX»A6 ATTCCAGAGC CCTGCCAGCC CAAOGTGCCT OAGCCCTQCC CTTCAAOKST 2220 

CACTCCAGCA CCRGCCCAGC AGAAGACCAA GCAGAAGTAA TGTGGTCCAC AGCCATGCX:C 2280 

TTQAGGAGCT GGCCACTGGA TACTGAACAC CCTACTCCaT TCTGCTTATG AATCCCSVTTT 2340 

GOCTATTGAC CCTGCAGTTA GCATGCTGTC ACCCTGAATC ATAATOGCTC CTTTGCACCT 2400 

CTAAAAAOAT GTCCCTTACC CTCaTTCTGG AG6CTCCTGA GCCTCTGCGT AAGGCTGAAC 2460 

QTCTCakCTaA CTGAaCTAOT CTTCTTOTTG CTCGGGTGCA TTTGAGQATa GATTTGGGGA 2S20 



I I i i I I 

MHSCXXJKQPC TPPFQPQQQQ VKQPCQPPPQ EPCIPKTKEE OQPXVPBPCH PKVFEFGQPK 
20 IPEPCQPKVF BPCFSTVTPA PAQQKrKQK 

Seq ID NO: 52 DNA sequence
Nucleic Acid Accession #: NM_002638.1
Coding sequence: 120-473

1 11 21 31 41 51 

cswvtacagct aaqgaattat cccotgtaaa tacx3^cagac cogocctgga gccaogccaa 
30 gctggactgc ataaagattg qtatgocxot agctcttaqc caaacsicctt cctgacacca 
tgagggccag cagcttcttg atcgtggtgg tgttcctcat cgctggqaco ctggttctag 
aggcagctqt cacgggagtt cctgttaaag gtcaagacac tgtcaaac3gc cgtgttccat 
tcaatggaca agatcccgtt aaaggacmg tttcagttaa aggtcaagat aaagtcaaag 
cx3caagagcc agtcaaagot ccagtctcca ctaagcctgg ctcctgcctc attatcttga 
35 TCOjGTGCGC catqttgaat ccccctaacc gctgcttgaa agatactgac tgcccaggaa 
tcaagaagtg ctgtgaaggc tcttgoggga tggcctgttt cgttccccag tqaagggafic 

CX3GTCCTTGC TGCACCrGriG COSTCCCCAG AGCTAC3WX3C CCCATCIGGT CCTAAGTCCC 
TGCTGCCCTT CCCCTTCCCA CACIGTCCAT TCTTCCTCCC ATTCAQQATQ C!CCAOGGCTG 
GASCTGCCrC TCTCATCCAC TTTCCAATAA A 



45 I I I I I I 

MRASSPIiIW VPLIAGTliVl. BAAVTOVPVK GQQTVKfSVP PNGQDPVRGQ VSVKGQOKVK 60 
AQGFVX6PVS TKPOSCPllli IRCAMIiNPPN RCLXDTDCP6 IKKCCBOSCQ MACPVPQ 

Seq ID NO: 54 DNA sequence
Nucleic Acid Accession #: NM_019618
Coding sequence: 75-584

1 11 21 31 41 51 

55 GGC31CGAGCC ACXjATTCSAGT CCCCTGGACT GTAGATAAAG ACCCTTTCTT aCCAGGTGCT 60 

GAGACAACCA CACTATGAOA GCX:ACTCCAS GAGACGCTGA TGGTGGAGGA AGGGCCGTCT 120 

ATCAATCAAT GTGTAAACCT ATTACTGQGA CTATTAATGA TTTGAATCAG OUIGTGTGGA 180 

CCCTTCAGGQ TCAGAACCTT GTGGCAGTTC CACGAAGTGA CAGTGTGACC CCAGTCSVCTG 240 

TTGCTGTTAT CACATGCAAG TATCXaVGAGG CTCTTGAGCA AGGCAGAGGG GATCCCATTT 300 

60 ATTTGGGAAT CCAGAATCCA GAAATGTGTT TGTATTGTQA aAAGGTTGQA GAACAGCCCA 360 

CATTGCAGCT AAAAGAGCAG AAGATCATGG ATCTGTATGG CCSkACCCOAO CCCGTGAAAC 420 

CCTTCCTTTT CTACCGTGCC AA6ACTGGTA GGACCTCtXC CCTTGAGTCr GTOGCXTrTCC 480 

OGGACTGGTT CATTGCCTCC TCCAAGAGAG ACCRflCCCRT CATTCTOACT TCAOAACTTG 540 

GGAAGTCATA CAAC3VCTGCC TTTCAATTAA ATATAAATGA CTGAACTCAG CCTAGAGGTG 600 

65 GCAaCTTGGT crTTGTCTTA AAGTTTCTGG TTCCCAATGT GTTTTCGTCT ACATTTTCTT 660 

AGTGTCATTT TCACGCTGGT GCTGAGACAG GGGCAAGQCT GCTOTTATCA TCTCATTTTA 720 

TAATGAAGAA GAAGCAATTA CTTC»TAGCA ACTGAAGAAC AGGATGTGGC CTCAGAAGCA 780 

GGAGAGCTGQ OTGOTATAAQ GCTGTCCTCT CAAGCTGGTG CTGTGTA GGC CACAAGGCAT 840 

CTGCATOAGT GACTTTAAGA CTCAAAGACC AAACACIGAG CTTTCTTCTA QGGGTGGGTA 900 

70 TQAAGATGCT TCAGAGCTCA TGCGCGTTAC OCRCXSATCGC ATGACTAGCA CSVaAGCTGM 960 

CTCTGTTTCT GTTTTGCTTT ATTCCCTCTT GGQATQATAT CATaaVGTCT TTATATQTTG 1020 

CX3iATATACC TCATTGTGTG TAATAGAACC TTCTTAGCAT TAAQACCTTG TAAACAAAAA 1080 

TAATTCTTGT GTTAAGTTAA ATCATTTTTG TCCTAATTQT AATGTOTAAT CTTAAAGTTA 1140 

„^ AATAAACTTT QTGTATTTAT ATAATAAAAA AAAAAAAAAA AAA 



1 11 21 31 41 51 

80 I I ] I I 1 

MRGTPGDADG GGKAVYQSMC KPITOTINDL HQQVMTUiGQ HLVAVPRSDS VTPVTVAVIT 
CKVPEALEQG RCDPIYVBIQ MPEMCTiYCBK VGEQPTI«ZiK EQKIMDIiYGQ PE»VKPFI.Flf 
RAKTGRTSTL ESVAFPDWPl ASSKRDQPII liTSHU3KS«J TAFEUnND 

Seq ID NO: 56 DNA sequence

Nucleic Acid Accession #: NM_003125
Coding sequence: 65-334
1 11 21

I I I 

AGCAGTTCTA AGGGACCATA CAGAGTATTC 
5 CAGCATGAGT TCXXXGCAOC AGAAGCAGCC 
GCAGGTGAAA CAGCCTTGCC AGCCTCCACC 
GCCCTGCCAC CCCAAGGTGC CTGAGCCCTG 

c3«Gcrrcc» gagccatgcc accccaaggt 

AGCACCAGCC CAGCAGAAGA CCAAGCRGAA 
10 ftSCCGGCCAC CAOATGCTGA ATCCCCTATC 
CAATTAGCAT TCTGTCTCX:C CCAAAAAAGA 
TCTGAGTCTC TGAATGAAGC TGAAGGTCTT 
ATTCATCTQA AGAGAGACTT AAGATGAAAG 
AAATTCACTT TCAATTCCA 



31 41 51 

I 1.1 

CTCTCTTCAC ACCAGGACCA GCCACTGTTQ 60 

CTGCATCCCA CCCCCTCAGC TTCAGCAGCA 120 

TCAGGAACCA TGCATCCCCA AAACCAAGGA 180 

CCACXrCCAAA GTGCCTGAGC CCTGCCAGCC 240 

GCCTGAGCCC TGCCCTTCAA TAGTCACTCC 300 

GTAATGTGGT CCAC3«3CCa.T GCCCTTGAGG 360 

CCATTCTGTG TATGAGTCCC ATrTGCCTTG 420 

ATGTGCTATG AAGCTTTCTT TCCTACACAC 480 

AGTACCAGAG CTAGTTTTCA OCTGCTCAGA 540 

CAAATGATTC AGCTCCCTTA TACCCCCATT 600 



25 



I 1 1 r I 1 

MSSQQQKQPC IPPPQLQQQQ VKQPCQPEPQ EPCIPKTKEP CHPKVPBECH PKVPEPCQPK 
LPEPCHPKVP EPCPSIVTPA PAQQKTKQK 

Seq ID NO: 58 DNA sequence

Nucleic Acid Accession #: NM_001793.2 

Coding sequence: 71-2560 



I I I I I I 

AAAGGGGC31A GAGCTGAGCG GAACACOGGC CCX3CCGTCX3C GGCAGCTGCT TCRCCCCTCT 60 

CTCTGCAGCC ATGGGGCTCC CTCXSTGGACC TCTCQCGTCT CTCCTCXTFTC TCCAGGTTTG 120 

CIGGCTGCAG TGOSCGGCCT CCGAGC03TG CCGGGCGGTC TTCAGGGAGG CTCAAGTOAC 180 

35 CTTGGAGGCG GGAGGCXJOSQ AQCAGGAGCC CGGCCAGGOS CTGGGGAAAG TATTCATGGG 240 

CTGCCXTTOGO CAAGAGCCAQ CTCTCTTTAG CACTGATAAT GATGACTTCA CTOTQCXSQAA 300 

TGGOSAGACA OTCCAGQAAA GAAGGTCACT GAAGOAAAGO AATCXATTGA AOATCTTCCC 360 

ATCCAAAOGT ATCTTACGAA GACACAA6AG AOATTOGGTG GTTGCTCCAA XATCTGTCCC 420 

TGAAAATGGC AAGGGTCCXTT TCCCCCAGAG ACTOAATCAQ CTCAAOTCTA ATAAACmTAG 480 

40 AGACACCAAG ATTTTCTACA GCATCAOGGG GCCGGGGGCA GACAGCCCCC CTGAGGGTGT 540 

CTTOGCTGTA GAGAAGGAGA CAGGCTGGTT GTTGTTGAAT AAGCCACTGG ACCGGGAGGA 600 

GATTGCCAAG TATGAGCTCT TTGGCCACGC TGTGTCAGAG AATGGTGCCT C AGTG GAOgA S60 

CCCC»TOAAC ATCTCCATCA TCGTGACCGA CCAGAATGAC CACAAGCCCA AGTTTACCCA 720 

GGACACCTTC CGAGGGAGTG TCTTAGAGGG AGTCCTACCA GGTACTTCTG TGATGCAGGT 780 

45 GACAGCCACXJ GATGAGGATG ATGCCATCTA CACCTACAAT GGGGTGGTTG CTTACTCCAT 840 

CCaTAGCOlA GAACCAAAiQG ACCCACACQA CCTCATGTTC ACCATTCACC GGAGCACS^GG 900 

CACCATCAGC GTCATCTCCA GTGGCCTGGA CCGGGAAAAA QTCCCTGAQT ACACACTGAC 960 

CATCCRGGCC ACAGACATGG ATGGGGACJGG CTCCACCACC AOSGCAGTGG CAGTAGTGGA 1020 

GATCxrrraAT gccaatgaca atgctcccat qtttgacccc cagaastacg aggcccatgt loso 

50 GCCTGAGAAT QCAGTGGGCC ATGAQGTGCA GAGGCTGACG GTCACTQATC TGOACGCCCC 1140 

CAACTCACCA GCQTGGCGTG CCACCTACCT TATCATGGGC GGTGACGACO GGGACCATTT 1200 

TACCATCACC ACCCACCCTG AGAGCAACCA GGGCATCCTG ACaACCAGGA AGGGTTTGOA 1260 

TTTTGAGGCC AAAAACCftGC ACACCCTGTA CGTTQAAGTG ACCAAOGAGG CCCXTTTTTOT 1320 

GCTGAAGCTC CCAACCTCCA CAGCCRCCAT AGTGGTCCAC GTGGAGGATG TGAATGAGGC 1380 

55 ACCTGTGTTT GTCCCACCCT CCAAAGTCKT TGAGGTCCAG GAGGGCATCC CCACTGGGGA 1440 

OCCTGTGTGT GTCTACACTG CAQAAGACCC TGACSiAGGAG AATCAAAAGA TCAGCTACCG ISOO 

C3VTCCroAGA GACCCAQCAG GGTOGCTAGC CATGGACCCA GACAGTGGGC AGGTCACAGC 1560 

TGIGGGCACC CTCGACXSSTG JMSGATGAGCA GTrTOTGMG AACAACATCT ATGAAGTCAT 1620 

GGTCTTGGCC ATGGACAATG GAAGCCCTCC CACCACTGGC AOGGGAACCC TTCTGCXAAC 1680 

60 ACTGATTGAT GTCAATGACC ATGGCCCAGT CCCTQAGCCC CGTCAGATCA CC31TCTGCRA 1740 

CCAAAGCCCT GTGCGCCAGG TGCTGAACAT CACGGACAAG GftCCTGTCTC CCCACRCCTC 1800 

CCCTTTCCAQ QCX:CAGCTCA CAGATGACTC AGACATCTAC TGGAOQGCA6 AGGTCAACGA 1860 

GGAAGGTGAC ACAGTGGTCT TGTCCCTGAA GAAGTTCCTQ AAGCAQGATA CATATOACXJT 1920 

GCaCCTTTCT CTGTCTGACC ATGGCAACAA AGAGCAGCTG AOGGTGATCA GOGCCACTOT 1980 

65 eTGCGACTGC CRTGGCXaTQ TCGAAACCTG CCCTGGACCC TGGAAGGGAG GTTTCATCCT 2040 

CCXrEGTGCTS OGGaCTGTCC TGGCTCTGCT GTTCCTCCTG CTGGTGCTGC TTTTGTTGGT 2100 

GAGAAAGAAG CGGAAGATCA AGGAGCCCCT CCTACTCCCaV GAAGATGACA CCCGTGACAA 2160 

CGTCTTCTAC TATGGCGAAG AGGGGGGTGG CGAAGAGGAC CftGGACTATG ACATCACOa 2220 

GCTCCACCGA GGTCTGQAGG CCAGGCCGGA GGTGGTTCTC CGCRATGAOG TGQCACCAAC 2280 

70 CATCaTCCCG ACACCCaiGT ACCGTCCTCG GCCAGCXaAC CCAOATGAAA TOSaCllACTT 2340 

TATAATTGAG AACCTGAAGG CX3GCTAACAC AGACCOCACA OCCCOBCCCT ACGACACCCT 2400 

CTTGGTGTTC GACTATGAGG GCAGCGGCTC CGACGOOGCG TCCCTGAGCT CCCTCACCTC 2460 

CTCCGCCTOC GACCAAGACC AAGATTACGA TTATCTGAAC QAGTGGGGCA GCCGCTTCAA 2S20 

GAAGCIGGCA GACATGTACG GTGGCGGGGA GGACGACTAG GCGGCCTGCC TGCAGGGCTG 2580 

75 GGGACCAAAC GTCAGGOCAC AGAGCATCTC CAAGGGGTCT CAGTICCCCC TTCAGCTGAG 2640 

<»CrTGGGAG CTTGTCAQGA AGTGGCCGTA QCaACTrGGC GGAGACAGGC TATGAGTCrO 2700 

ACGTTAGACT GGTTGCTTCC TTAGCCTTTC AGQATQGAGG AATGTGGGCA GTTTGACTTC 2760 

AGCACTGAAA ACCTCTCCAC CTGGGCCAGG GTTGCCTCAG AGGC CAftG TT TCCAOAAGCX: 2820 

TCTTACCTGC CGTAAAATGC TCAACCXTTGr GTCCTGGOCC TCOGCCTGCT GTGACTGACC 2880 

80 TACAGTGGAC TTTCTCTCTG GAATGGAACC TTCTTRGGCC TCCIG6TGCA ACTTAATTTT 2940 

TTTTTTTAAT GCTATCTTCA AAACGTTAGA GAAAGTTCTT CAAAA6TGCA GCCCAOAGCT 3000 

GCTGGGCCCA CTGGCCGTCC TGCATTTCTO GTTTCCAGAC CCCAATGCCT CC CATT OGGA 3060 

TGGATCTCIO CGmTTATA CTGAGTGTGC CTAOGTTGCC CCXTATTTTT T ATTTTCCC T 3120 

GTTGOGTTGC TATAGATOAA GGGTOAGGAC AATOGTGTAT ATGTACTAGA ACTTTTTTAT 3180 

85 TAAAQAAACT TTTCCCAGAA AAAAA 

Seq ID NO: 59 Protein sequence: 
Protein Accession #: NP_001784.2

1 11 21 31 41 51 

« I I I I I i 

D MGLPRGPLAS LLLLQVCWLQ CARSEPCSAV FREAEVTLEA GGAEQEPGQA LGKVFMGCPG 60 

QEPALPSTDN DDFTVBNGET VQERRSLKER NPLKIFPSKR ILRRHKHDWV VAPISVPENG 120 

KGPFPQRLNQ LKSMKDRDTK IFYSITGPSA DSPPEGVFAV EKETGWLr.LM KPLDREEIAK 180 

YELFGHAVSE NGASVEDPMN ISIIVTDQND HKPKPTQDTP RGSVLEGVIiP GTSVMQVTAT 240 

DEDDAIYTYN GWAYSIHSQ BPKDPHDLMF TIHRSTSTIS VISSGIiDHEK VPEYTLTIQA 300 

10 TDMDGDGSTT TAVKWEIU} ANDNAPMFDP QKYEAHVCEN AVGEEVQRLT VTOUIAPIISP 3fi0 

AHBATYLIMQ QDDGDHFTIT THPESNQGIIi TTRKGLDFEA KNQBTLYVEV TNBAPFVUCL 420 

PTSTATIWH VEXnWEAPVF VPPSKWSVQ EGIPTGEPVC VYTAEDPDKE NQKISYRILR 480 

DPAGWLAMDP DSGQVTAVGT LDREDEQFVR NNIYBVMVIiA MDN6SPPTTG TGTI,LI.TI.ID 540 

VNDHGPVPEP RQITICNQSP VRQVUJITDK DLSPHTSPFQ AQLTDDSDIY MTAEVWBBGD 600 

15 TWLSLKKFI. KQDTYDVHLS LSDHGNKEQl. TVIRATVCDC HGHVETCPGP WKGGFIIiPVL 660 

GAVU^LLFLI. LVIjI.LI,VRKK RKIKEPtLIiP EDDTHDNVFY YGEEGGGEED QOYDITQLHR. 720 

GLEARPEWI. RNDVAPTIIP TPMYRPRPAN PDEIGNFIIE NLKAANTDPT APPYDTLLW 780 

DYEGSGSDAA SLSSLTSSAS DQDQDYDYLN EWGSRFKKIA DMYGQGEDD 

Seq ID NO: 60 DNA sequence

Nucleic Acid Accession #: Eos sequence
Coding sequence: 162-428 

1 11 21 31 41 51 

25 I 1 I I i I 

GOGTTCCGTT GGCGGCGGAT TCGAACGTTC GGACTGAGGT TTTTCTGCCT GAAGAAGCGT 60 

CATACXX5ACC GGATTGTTTT OSCTGGCCXA GTGTCCCCGG AGCTTGTGTG CGATACAGAG 120 

AGCACCTCGG AAGCTGAGGC AGCTGGTACT TGACAGAGAG GATGGCGCTG TCGACCATAG 180 

TCTCXXXGAa GAAOCAQATA AACC3GGAAG6 CTCCCC6TG6 CTTTCTAAAa CBAGTCTTCA 240 

30 AOCGAAAOAA GCX^TCAACTT CSTCTQOAGA AAAGTGGTGA CTTATTGGTC CATCIGAACT 300 

GTTTACTGTT TGTTCATCGA TTAGCAGAM3 AGTCCAGQAC AAAOGCTTGT GGGAaTAAAT 360 

GTAGAGTCAT TAACAAGGAG CaTGTACTGG CC6CAGCAAA GGTAATTCTA AAGAAGAGCA 420 

GAGGTTAGAA GTCAAAGAAC ATATTCTTGA AAGTTATGAT GCATTCTTTT GGQTGGTAAC 480 
_ _ AGATCATAAA GACATTTTTT ACACATCAGT TAATATGGGA TTATTAAATA TTQG 

35 



40 I I I I I I 

MALSTIVSQR KQIKRKAPRG FIiKRVFKRKK PQUII.EKSGO LLVHLNCLLF VHRLABESRT 60 
NACASKCRVI NKEHVIAAAK VILKKSRG 

Seq ID NO: 62 DNA sequence 
Nucleic Acid Accession #: NM_000094.2
Coding sequence: 99-8933 

1 11 21 31 41 51 

<A I I t 1 ' I 

50 GGGCTGGAGG GGCGCTGGGC TCQQACCTGC CAAGGCCACC GCAGGGGGGA GCAAGGGACA 60 

GAGGOGGGGa TCCTAGCTOA CGGCTTTTAC TGCCTAGQAT 6AC6CTGCGG CTTCTGGTGG 120 

CCGCGCTCTG CGCOOGGATC CIGQCAGAGG CGCCCCGAOT GOGAGCCCAG CAC3U3GGA6A 180 

GAGTGACCTG CAOGCGCCTT TAOGCOGCTG ACATTGTCTT CTTACTOGAT GGCTCCTCAT 240 

CCATTGGCCG CAGCAATTTC OGCGAGGTCC GCAGCTTTCT CGAAGGGCTG GTGCTGCCTT 300 

55 TCTCTGOAGC AGCCAGTGCA CAGGGTGTGC GCTTTGCCAC AGTGCAGTAC AGOGATGACC 360 

CACGGACAGA GTTCGGCCTG GATGCACTTG GCTCTGGGGG TGATGTGATC CGCGCCATCC 420 

GTGAGCTTAG CTACAAGGGG GGCAACACTC GCACRGGGGC TGCAATTCTC CATGTGGCTG 480 

ACCATGTCTT CCTQCCCCAG CTGGCCCQAC CTGGTGTCCC CAAGGTCTGC ATCCTGATCA 540 

CBGACGGGAA GTCCCAGGAC CTGGTGGACA CAGCTGOCCA AAGGCTGAAG GGGCAGGGGG 600 

60 TCAAOCIATT TGCIGIGGOG ATCAAGAATG CTGACCCTGA GGAGCTGAAG OQAGTTGCCT 660 

CACAGCCAAC CtCaSKCTTC TTCTTCTTCG TCAAX6ACTT CAGCATCTTO AGQACACTAC 720 

TOCCCCTOBT TTCCCGQAOA GICTGC&CGA CT6CTGGTGG CGTGCCTGTG ACCCQACCTC 780 

OGOATOACTC GACCTCTGCT CCACGAOACC TOGTGCTGTC TGAGCCAAGC AGCCAATCCT 840 

TGAGAGTACA GTGGACAGCG GCCAGTGGCC CTGTGACTGQ CTACAAGGTC CAGTACACTC 900 

65 CTCTGACGGG GCTGGGACAG CCACTGCCGA GTGAGCGGCA GGAGGTGAAC GTCCCAGCTG 960 

GTGAGACCAQ TGTGCGGCTG OGGGGTCrCC GGCCACTGAC CGAGTACCAA GTGACTGTGA 1020 

TTGCCXrrCTA CGCCAACAGC ATCGGGGAGG CTGTCAGCGG GACAGCTCGG ACCACTGCCC 1080 

TAGAAOGGCC GGAACTGACC ATCCAGAATA CCACAGCCCA CAGCCTCCTG GTGGCCTGGC 1140 

GGMSTGTGCC AGGIGCCACT GOCTACCGTG TGACATGGGG aOTGCTCAGT GGTGG6CCCA 1200 

70 CACAGCAGCA GGAGCIQaGC CCTGGGCAG6 GTTCAGTOTT GCTGCGTGAC TTGGAGCCTG 1260 

aCAOQOACTA TOAGGTQACC GKUGCACCC TATTTGGCOG CAGTOTGGGQ COOGCCACTT 1320 

CCCTGATGGC TCGCACTQAC aCTTCTGTTQ AGCAQACCCT GCGCCCGGTC ATCCTGGGCC 1380 

CCACATCCAT CCTC*TTTCC TGGAACTTGG TGCCTGAGGC CCGTGGCTAC OGGTTGGAAT 1440 

GGCQQCQTGA QACTGGCTTQ GAGCCACCGC AGAAGGTGGT ACTGCCCTCT GATQTGACCC ISOO 

75 GCTACCAGTT GGATGGGCTG CAGCCGGGCA CTGAGTACCG CCTCACACTC TACACTCTGC 1560 

TGGAGGGCCA CGAGGTGGCC ACCCCTGCAA CCGTGGTTCC CACTGGACCA GAGCTGCCTG 1620 

TQAGCCCT6T AACAGACCTG CAAGCCACCG AGCTGCCCGQ GCAGCQGGTG CGAGTGTCCT 1680 

GQAGCCCAGT CCCTGGTQCC ACCCASTACC GCATCATTGT GCGCAGCACC CAGGGGOTTa 1740 

_ - AOCOGACCCT GGTGCTTCCT GGQAGTCRGA CAGCaTTCQA CTrGGATQAC OTTCAGGCTO 1800 

80 GGCTTAGCTA CACTGIGOGG GTCTCT G CTC GAOTGOSTCC CCGTQAGGGC A0I6CCAGTQ 1860 

TCCTCACTGT CCGCCGGGAG CCGGAAACTC CACTTGCIGT TCCAOGGCTG CGGGTTGTGG 1920 

T6TCAGATGC AAOGCGAGTG AGGGTGGCCT GGGGACCC6T CCCTGGAGCC AGTCGATTTC 1980 

GGATTAGCTG GAGCACAGGC AGTGGTCCGG AGTCCAGCCA GACACTGCCC CCAGACTCTA 2040 

CTGCCACAGA CATCACAGGG CTGCAGCCTG GAACCACCTA CCAGGTGGCT GTGTCGGTAC 2100 

85 TGCGAGGCAG AGAGGAGGGC CCTGCTGCAG TCATCGTGGC TCGAACGGAC CCACTGGGCC 2160 

CAGTGAGGAC GGTCCATGTG ACTCAGGCCA GCAGCTCATC TOTCACCRTT ACCTGGACCA 2220 

GGGTTCCTGG CGCCACAGGA ZACAGGGITT CCTGGCACTC AGCCCACGGC CCAGAGAAAT 2280 
CCCRGTTGGT TTCTGGGGAQ
AGTATACGGT GCATGTGAGG 
TTQTGAGGAC TGCCCCTQAG 
CCAGCGACX3T TCTACGGATC 
CCTGGGGCOG GAGTGAAGGC 
CTGCAGAGAT CCGGSGTCTC 
TCGGGGRCCG CGAGGGCACR 
CAGCCCTGGG GACGCTTCAC 
AGCCGGTGCC CAOAGCGCAG 
AGTCCCGGGT CCTGGGGCCC 



ACCTGGGTAG 
GGCCCCATGA 
GAAGGTGGAG 
CCTGTCTCCA 



C GTGTGTCGAG 



CGATOGACTC GGTGACTTTG 
CCTGGCGGCC ACTCAGAGGC 
GGATCTCAAG CTCCCAGCGG 
TOACXSCCTGT CCTGGATGGT 

GcccccxrrGG cctggcggat 



TGAAISaCTC 
ACOCAAGTGG 
CAGATQCTCC 
CCTTGAGAGG 



QGCTTCCTTC 
QAGCTC3U3CA 
AGTGTCXTrAO 

CACTOnaTCA cctcbtgttc caagcatiga actagqtgto 



GGCACCAGAT 
TCAGCTACTC 
TTGTTGTCAC 
GCGGGGAGCA 
TGCACTGGCA 
GCTATCACCT 



\ TG6ACTGGAQ 
TGGGCCCCCT 
GCTGCAGATC 
AGCCACAGCT 
ACTCCCAGGA 
AGTGCGAGTG 
TACGCCGCCT 
CrCGCTGAGG 
ACCTGAGGGT 
GGACGGGCTG 



CCAGATACCG 
GOCTCIGTGG 
CTCAATGCTT 
TACAQACTGG 
AACACAGACT 
ACTGCACTTG 2640 






GAGQCTCCSC 
CTGCGCTGGG 
GGCCAGGAAC 
GAGCCAGC6A 
TCTGCAGAGG 



ACTCTGTCCA 
GTCTGGCCAC 
CAQTGTATTG 
TTGGGCCTCC 
CCCCTGGAAG 
GCAOCCCTGQ 
CAGGGTTGCC 
AGCCGGGGGC 
GGGACCCTGG 
GTGGCCCCCC 
GGGGTCCCCC 
TTCCCQGAAG 
GTGACTCT6A 



TGACATATTC 
AATGGCTGGA 
QACCTTCTTC 
AGCXXTGTGT 
TCCAAAGGGC 
TGGCGACCCT 



GCCGTGGATG 
CAGGCATCCT 
CAOAAGGGGG 



CAGTGTCCAG 
AAGTCCCTGG 
TAGAGCCTGG 
CTGAGGCATC 
TACCACATGC 
GTCTGGTGTT 
ACAGTCATCG 
TGCAAAGGAT 
TGGTCACAGC 
CAGGGGTQAT 
GTGAGGCCCA 
AGCAGCTGCG 
ATGGGCCAAG 
TCACTACTCA 
AACCTGGAGA 



CACTCAAQAC 
GGCACTTGGG 
GCCCTCCCCA 



TACATCCTAT 
ACACTTCCAG 
ATCTTCTCCC 
AOrcCAGTGT 
AATGCTCACC 
CCTCrTGGGC 
CTBITCCCAC 
CCCTACATGG 



TCGCTTGGOO CCGGGTATGG 
CCTCGACCAG GCAGTCAGTG 

GccaxaccA gagccctgcc 

GATGGGCCTG AGAGGACAAG 
TGCrCCCGGC CCCCAGGGGC 



CCCTGGACCC 
G6ATGGAGCT 
ACCTCCTGQA 



T6CCAGGG6T TGCTGGACGT 



CTGTTGCTGG 
CCGGAGTCCA 
AGGGAGACCC 
ACTCAGGGCC 



CGGGTTTGCC 
CTGTGGGTGA 
GATCATCTGG 
TGGTAGACAC 
GTCCTCGAGG 
AAGGGTTTCG 



AGGGGAACGG 
TGGAGACCGG 
TCCTGGAGAG 
ACGAGATGGT 
TGGAAAAGCA 
AAAGGGAGAC 
ACCCAAGGGT 
AGGACCTGGA 
GCCCAAGGGT 



GTCATCGGAG 
CCCCCTGGAC 
GGAACAGCCA 
GAAGGTGGCA 
CAAGGCX:CCG 
CCAGGCCTCC 
GCTATTGGCC 
6QOGAACX3TG 
CCTGGAGCCa 
GAGCCTGGTC 
GAAAAGGGAG 



GAGCCGCTGG 
ACGGGCTTCC 
TACCGGOAAA 
GGGACCCTGG 
AAGGTOGTGA 
CTCCAGGCCT 



CCGGGGTCXTC 
GCCCTCTGGG 
AGGCCTCCGT 
GCCAGGCGAG 
AGAAGACGGG 
TGGCXXCAAO 
CCCAG6GCCA 



GTGAAGGACC 
CTOGTGGACC 
TGAAGGGTGA 
TTGCTCCTGG 
TTGGCCCCCC 
CAGGACAACC 
CCAAAGGTGA 
GACCCCCAGG 
AGGGTCCTGA 
GCCCTGGGGA 
ATQTGGGOCC 
GCTTGGTTCT 
aGTCCCATTG GCCTTACTGG 
AAGGGAGACC CTOGGCXncC 
GAAGTTGGAO AGAAAGGTGA 
GCCTTCGGGG 
CTGGAGAGGA 
AGCOGQGTCC 

GCCAGAGAGA 
GATCCTGGCC 
GGCCCACAGG 
CCTGGGCrGG ATGGCOC3aA< 
CCGAATGGTG 
GGAGAACAAG 
GATGGGAAAC 
AGGAAGGGAQ 



ACCTOGAGOC 
TGGGCTTCCT 
ACTGGGGGAC 
CAAAGGCGAT 
GGAGCCTGGG 
TGGAAAGAAA 
TCGOTCTCCG 

ccGGGGcrrr 

CCCAGCGGGA 
AGGGCCACCA 
CCCTGCAGTG 



AAGGGCTCTC 
CXAAAG6GGG 
GGGCGGAAAG 
CCAGGACCCC 



CTGCCGGGTC 
GGAGAAAAAG 
GGTQAGCAGG 



TCCCGGGGGC 
GGACCCACTG 
6TGGGACCTG 



CAGGGAGATC 



TGGCCCCCCA GGACCTGTTQ 
CGAGGGTCCT CCGGGTGACC 
GGCACCTGGA GTTCGGGGGC 
TGGACX3AAAT GGCAGCCCTG 
CCCAGGACCC CCGGGACGGC 
TGQGGACCGC GGACAAGAGG 
CCCTGOGOAA AGGGQCATTQ 



GTGGGCCCTC 



CTGC3«3GCAA AGCTOQGGAC CCASGGAOAO 
GCCTCCCTGG CCCCTCTGGT CCCCCTGGAT 
CTGGCCTGAA TGGAAAAAAC GGAGAACCTG 
AGAAAGGAGA TTCAGGCGCC TCTGGQAGAG 
GAGCrCCTGG TATCCTTGGA CCCCAGGGGC 
CTGGCCAGGG TTTTCCTGGT GTCCCAQGAG 



: TGGCCCTCAG 
GGCCTTCCGQ CCTTGCCGGG 
CTGGGGCIGT GGGA6A6GCA 



GACCCCCTGQ CCCCAAGGTQ 



QAOCCrGGAA aTOTGCCaAA TGRGQATCQG TTGCrGGAAA 
QCCCIGOGGG AGATCCrrGGA GACCTGGGAT GAGASCTCCQ 
GAACQGOGTC GAGQCCCCAA GGGOGACTCA GGCIGAACAGG 
CCCATOGGCT TTCCTGGAGA ACGOQGGCTG AAGGGCGACC 
GGGCCACCTG GTCTGGCCCT TGGGGAGAGG GGCCCCCCCG 
GAGCCTGGAA AGCCTGGTAT TCCCGGGCTC CCAGGCAGGG 
GGAAGGCCAG GAGAGAQGGG AGAACGGGGA OAGAAAGGAG 
GATQGCCCTC CTGGACTCCC TGOAACCCCT GGCCCCCCCG 
TCTGTGGATG AGCCAGOTCC TGOACTCTCT GGAGAACAGG 



AAeCMOACAG GGGTGTGCCA C 
CCOGGGTCTA 
TCCAAGAGGC 
GGGTCTTGCT 
GACAGGACCT 
CCCCGGCCCT 
GGAGACAGQG 



3 AGAGCCTGGA CCGAGGGQTC 



GTCTGCAGGG 
CTGGTGCCCC 
AGCCTGGAGA 
TTCCTGGACC 
QACAAGTGGG 



CCCCCTGGCC CAGTGGGTGG 



CCAGGAOGQG 
TCAGGCCTTG 
AAGCCGGGAG 



GCCTGACTOa 
TGGGTCCACA 
CCCCAGGTCG 



TGGGCCTGAA 
TCATGGAGAC 
ACCTTCTGGC 
ACCTACIGGA 
GGGGTCTCCA 
AGATGGTGCC 
TCTGCCTGGC 



CCTGGACCAC 
CTGAAGGGGG 
GCTGTGGGAC 
GGTTTGCCTG 
AGTOGAAAAG 
OCTGTCXSQAC 



CAAAaGGAOA GAAGGGAGCC CCTGGAGGCC 



CCAAAGGTGA 



AGGGTGACCC 
AGGGAGATCT 
AGACAGGCCC 



COGAGGACT6 
AOACCCTGGG 
AGGAGTCGGG 



TGGQACCACC 



GGGCCTCCCT 
TCGAGGAGAG 
6AGAGAAGGA 
TCGGGCCTCT 



GAAGATGGTC 
GTCCCGGGCT 
GGCCTGCCCG 
ATGGGTCAGC C 
ATCCCAGGAC C 
GQACTCAAA6 C 



TTGCTGQAGA 
GAGGCGAGAA 
AGAAAGGGGC 
CCCCTGGGCC 
GTGCTCCTGG 



GQCCQTGCAG 
AAAGGTTTCA 
CCAGGTGTGA 
TTCCOGGGTC 



2700 
2760 
2820 
2880 



3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 



4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 



5280 
5340 
5400 
5460 
S5Z0 
5580 
5640 
5700 
5760 

ssao 

5940 
6000 
6060 

6180 
6240 
6300 
6360 
6430 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6960 
7030 
7080 
7140 
7200 
7260 




CTGGQCCXXXS AGGCSSAGOGT 
AGGAGGQ&CC CCGAGQACTC 
GTGATGTTGG GAGTGCAGGA 
CTCCAGGCCC ACGGGGTGCC 
GTGACAAAQO ACCTCGGGGA 
CTGGTGACAA GGGCTCAGCC 
AACCTGGTC3C AGCAGGGA.TC 
GTATCCGACG AGAAAAAGGA 
GGGGAGTGAA GGGAGCCTGT 
AOGGGGCCCC CIOGCAGCAG GGGAGAGCST
CTAAAGGQTG ACAAGGGAGA CTCAGCIGTO ATCCTGGGGC 
AA6G6GGACA TGGOTGAAC9 AGGGOCTCGO GGCTTGGATG 
GACAATGGGG ACCCTGGTGA CAAGGGCAGC AAGGGAGAGC 
GGGTTGCCAG GACTGCGTGG ACTCCTGGGA CCX:CAGGGTC 
CCTGGTGACC CGGGATCCCC AGGAAAGGAT GGAGTGCCTG 
GATGTTGGCT TCATGGGTCC CCXSGGGCCTC AAGGGTGAAC 
GGCCTTQATG QAGAGAAGGG AGACAAGGGA GAAGCTGGTC 
GCAGGACACA AAGGAGAGAT GGGGGAGCCT GBXGTGCCGG 
AAGGAGGGCC TCATCX3GTCC CAAGGOTGAC OQAG6CTTTG 



COGGAGAGAG 
AGGGGCX3GCC 
ATQACATCCG 
TCATCGCATC 
TOCATGCTGT 



AGGGCCTGCC 
QGGCTTTGTG 
TGGATCACGA 
QCCTGTGCTC 
QTACTCTGAA 



AGTGGAAATG 
CCXX3AAGGAC 
GCTCCTGGGG 
GGTCCTOGAG 
CGCCAAGAGA 
CCCCTCCCTA 
CGCGTCTCTC 



ATGOCTCTGC 
TTCAGGGCCA 
TCCCTGGAGC 
GCGAGAAGGQ 
TQAGTCAGCA 
GTTATGCTGC 
ATGCAGAG6A 
ATTCTCIGSA 



GAAGGGTGAG CGAGGTCCCC 
TCCTGGCGAG AGAGGGGAGC 
AGAAGCTGCA CTQACGGAGG 
CTGTGCCTGC CAQGGCCAGT 
AGACACTOCC GGCTCCCAQC 
GGAAGAGCGG GTACCCCCTG 



ACACCCTGCQ CTOGTACCAT 
TCTATGGTGG CTGTGGAGGQ 
GCTGCCCACX: CCGGGTGGTC 
ATAATGAGCT GAGATTCAGC 
CCCTCCCCTT GGTGCTAGAG 
TCAGTGACTT Q6TCCCX3TGG 
CCTGCCACCC TGGCAGATGA 
ACTGGCXSTCT GACCCGCCCC 
GCATTAAAGC TGCTGTTTTA 



ATCCCCTGGA 
GCTTGTGTGC 
GTCTAGCCTT 
CTCACTGTGO 
TTGACCCAAG 
AAAQGCAAAA 



CAGGCAGCAC 
GTTTTGGGAC 
GGACAGGTAC 



TCAlGGGCTCC TGCACTGCCT 
AQAGGCCTGT CACCCrrTTG 
CCXJTOAGQCC TGCGAGCGCC 
TGCCXaCGAC TGAGGCCCAG 
TCTCAGCaGA ACCCCACTGT 
GCGAGTGCAC GTCCQTTATT 
OACAAACCCC CATTGTGGCT 
QTGGGCAaTG AGCGGATGTG 
CATG6TGCIG ATTCTGGGG6 



7740 
7800 
7860 
7920 
7980 
8040 
8100 

8220 
8280 
8340 
8400 
8460 
8520 
8580 
8£40 
8700 
8760 
8820 
8880 
8940 
9000 
9060 
9120 






LEGLVLFFSG 
AAILHVAOKV 
EEIiKRVASQP 



TEYQVTVIAIi 
RVLSGGPTQQ 
LRPVILGPTS 
RI,TLYTtLEG 
VRSTQGVERT 
VPQUWWSD 
YQVAVSVLRG 
SAKGFEKSQIi 
RLQILNASSD 
SVRVTAIiVQD 



I 

CAGIIAEAPR 
AASAQGVRFA 
FIiFQUUiPSV 
TSDFFPFVND 
QWTAASGPVT 
YANSIGEAVS 
QELGPGQGSV 
IIjLSWNIjVPE 
HEVATPATW 
LVLPGSQTAP 
ATRVRVAWGP 
REEGPAAVIV 
VSGEATVAEI. 
VIiRITWVGVT 
SEGTPVSIW 



21 
I 

VBAQHHERVT 
TVQYSDDPRT 
PKVCIIiITDQ 
FSILRTLLPIi 
GYKVQYTPLT 
GTARTTAr.EG 

llrdlepgtd' 
argvrlewrr 

PTGPELPVSP 
DLDDVQAGLS 



31 

I 

CTRIiYAADIV 
BFGLOAIXSSG 
KSQDLVDTAA 
VSRRVCTTAG 
GLGQPLPSER 
PELTIQKTTA 
YEVrVSTIiFG 
ETGLEPPQKV 
VTDLQATELP 
YTVRVSARVG 



TTPPBAPPAIi CTI>HWORGB 



SVTUVMTPVS 
VLDGVHGPEA 
VQVGLMYSH 
PGRRQHVPGV 
QTFFAVDDGP 
PGDPGLPGRT 
APGUCGSPGL PGPRGDPGBR 



EIiRWDTSID 
GVSYIPSITP 
LALGP1.GPQA 
AHRYMLAPDA 
RRLAPGMDSV 



RASSYItiSMR 
SVTQTPVCPR GLADWFLPH 
RFSFI.PPU3G SHDI<GIILQR 
MVIiLVDEPLR GDIFSPIREA 
SLDQAVSGLA TALCQASFTT 
GAPGPQGPPG SATAKGER6F 
GPRGPKGEPG APGQVIGGGG 



41 
I 

FLLDGSSSIG 
GDVIRAIREL 
QRIiKGQGVKL 
GVPVTRPPDD 
QEVNVPAGET 
HSLLVAHRSV 
RSVGPATSLK 
VLPSDVTRYO 
GQRVRVSMSP 
PREGSASVI.T 
QTLPPDSTAT 
SVTITWTRVP 
DGPPASVWR 
ILPCariDSAB 
HSIiRUtHBPV 
GEGPSAEVTA 



PGATGYRVTW 
ARTDASVEQT 
IJJGLOFGTEY 
VPGATQYRII 
VRREPETPLA 
DITGIiQPGTT 
GATGYRVSWH 
TAPEPVGKVS 
IRQLEGGVSY 
PSAQGFUiHW 
RTBSPRVPSI 



QASGLHWrO. 
QPRPEPCPVY 
PGAD8RP6SP 



ATRRVLBRLV 
GNHLGTAWT 
GMAOAOPEQL 



PLG0P6PRGP PGLPGTAMKG OKODRGERGP PGPGEGGIAP GEPGIiPGLPG SPGPQGPVGP 



RGPPGFQGDP 
PGIAGEQGI.P 
DGPKGERGAP 



DEGPPGDPGI. PaKAGERQLR QAPGVRGFVG ERQDQ(X3PGB 
PPGPPGRLVD TGPQAREKGE PGORGQEGPR 6PKGDP6LPG 
GVRGPAGEKG DRGPPGIiDGR SGLD6KFGAA 6PSQPN6AAG 
GPSGPPGIiPG KFGEDGKPGL NGKKGEPCTP GEDGRKGEKG 
OILGPQGPPQ LPGPVGPPGQ GFPGVPG6TG PKGDRGBTGS 
NVDRUfTAO IKASAI.SEIV BTHDBSSGSF LPVPERRROP 
P6PQGPPQLA 



AGPEGKPGLQ 6PRGPPGPVG GHGDPGPPGA PGUU3PAGPQ GPSGLKGEPG EIGPPGRGLT 
GPTGAVGLPG PPGPSGLVGP QGSPGLPGQV GETOKPGAPa RDGASGKDGD RQSP6VPGSP 
GliPGPVGPKG EPGPTGAPGQ AWGLPGAKG EKGAPGQLAG DLVGEPGAKQ DRGLPGPRGE 
KGBAGRAGBP GDPGEDGQKG APGPKGFKGD PGVOVPGSPG PPGPPGVKGD LGLPGLPGAP 
GWGPPGQTQ PRGEMGQPGP SGERQIAQPP <SIEQ1PGPI-G PPGPPGSVGP PGASGLKGDK 
aOPGVGLPGP RGSRGSPGIR GEDGRPGQGG PRGIiTGPPGS RGESGBKGDV GSAGUCGDKG 
DSAVIIX3PPG PSGARGOKGB ROPRGUIGOK OPRGDNSJPG DKGSKGEPGD KGSAGbPGIA 
GLLOPQOaPO AAOIPGOPGS FQKDSVFGIR OEKGDVGFMG PRGIdCQESOV KBACGLOGEK 
(SKGEAGPFG RPGUUaOBSE MGB9GVPGQS GAPOKE6LIG PXaDRGFDGQ VfSBKCDQfSEK. 
GERGTPGIGG FPQPSQOXSS AOPPGPFGSV GPRGPBGLQQ QKGEROPFGB RWGAPGVPQ 
APOERGBQGR PGPAGFBGBK GEAALTEDDI ROFVRQEHSQ HCACQGQFIA SQSRPLFSYA 
AQTAGSQLHA VPVLRVSBAE EBBRVPPEDD BYSSXSEYSV EEYQDPEAPN 
DBGSCTAYTL RWYHRAVTSS TBACHPFVYa GOGGNANRFe 
TAQD 



1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 



2280 
2340 
2400 
2460 

2580 

2640 

2700 

2760 - 

2820 

2880 

2940 



Seq ID NO: 64 DNA sequence
Nucleic Acid Accession #: NM_006945
Coding sequence: 1-219

1 11 21 31 41 51 

| | | | | |

ATGTCTTATC AACAGCAGCA GTGCAAGCAG CCCTGCCAGC CACCTCCTGT GTGCCCCACC3
CCAAAGTGCC CAGAGCCATG TCCACCCCCG AACTGCCCTG AGCCCTGCCC ACCACCAAAG
TGTCCACAGC CCTGCCCACC TCAGCAGTGC CAGCAGAAAT ATCCTCCTGT GACACCTTCC
CCACCCTGCC AGCCAAAGTA TCCACXX5AAG AGCAAGTAA



| | | | | |

MSYQQQQCKQ PCQPPPVCPT PKCPEPCPPP KCPEPCPPPK CPQPCPPQQC QQKYPPVTPS
PPCQPKYPPK SK

Seq ID NO: 66 DNA sequence 

Nucleic Acid Accession #: NM_005629.1

Coding sequence: 639-2546 



I 

TAGTCGGAGC 
CCGCCGCCGQ 
COGCCGCCAC 
CGGGGCX3GGG 
CCGATGTCGC 



1 



I 



CCTCGGQQCC 
GCOGCGACCC 
ACGGCATCTA 



GAGGTGGCGA GTCGCTOAGC CCGCCGCGGC CCCGAGAGCG GCTGCAGCCG 
GAAGGAGAGG GCGAGGOGCXS COCGAGCCGC CGCGGCCGCC GCCACCGCCG 
CACCGCCACC GGAGTCGCGO GCCaGCOOGG C ACCCT CCBC (MOCCCOOaC 
GGCGCGGGCC ACAGGCCCCT GCTCOGGCCG TCeTTTGCAG ACCGCXaGGOG 
CCGCGCCCCG TTAGGATQAQ TCrCGOOTOG GGCGAGGAGC CQCCGCAGCC 
GAGCCGOGGG CAGGAGCCTC GGGA6CCGCC QCCGCOGOCQ CCGCOQCCCO 
GACGCCGCCC GCGCGCCCCC GGGCCCCCGA CACACATGAG ATTCTTCAGG 
AGTGCTTCGT GGACTGCTTC TGACTGCGCC GCCCGCGCCC CGCACCCCGC 
CCGCCCCGTC CCCCGGCCCG GCCGCCCCCC GGCCCCCGGC CGGCCCGCGC 
CTCCCCGGTG CCGCOGGTGC CCCCCGCCTG ACCGCCGCCC CCCGTGAGGC 
CGGCCCGGCC GTGCQGCCCQ CCGGGGCCAT OQCGAAQAAG ^^^^^^^^^ 
TAOCGTGTCC GGOGACGAGA * 



CCGTGCCGCC GCGCOAGACC TGOAOGCGCC AGATGGACTT CATCATGTCG 
TCGCCGTGGG CTTGOGCAAC GTGTGGCGCT TCCCCTACCT GTGCTACAAG 
GTGTGTTCCT TATTCCCTAC GTCCTGATCG CCCTGQTTGQ AGGAATCCCC 
TAGAGATCTC GCTGGGCCAG TTCATGAAGG COGGCAGCAT CAATGTCTQG 
CCCTGTTCAA AGGCCTGGGC TACGCCTCCA TGGTGATOGT CTT CTAC TGC 
ACATCATGGT GCTGGCCTGG GGCTTCTATT ACCTGOTCAA GTCCTTTACC 
CCTGGGCC»C ATGTOGCCAC ACCTGGAACA CTCCCGACT6 CGTGGAGATC 
AAQACTOTaC CAATGCCAGC CTGGCCAACC TCACCIQTGA OMOCTTGCT 
CCCCTGTCRT OGAGTTCTGa OAGAACAAAQ TCTTOAGGCI OTCTaO GGaA 
CAGGGGCCCT CAACTGGGAG GTGACCCTTT GTCTGCTGGC CTGCTQGGTG 
TCTGTOTCTG GAAGGGGGTC AAATCCAOSG QAAAGATOGT (STACTTCACT 
CCTACGTGGT CCTGGTCGTG CTGCXGGTGC GTGGAGTGCT GCTCCCTGGC 
GCATCATTTA CTATCTCAAG CCTGACTGGT CAAAGCTGGG GTCCCCTCAG 
ATGCGGGGAC CCAGATTTTC TTTTCTTACQ CCATTGGCCT GGGGGCCCTC 
GCAQCTACAA CCGCTTCAAC AACAACTGCT ACAAGGACGC CATCATCCTG 
ACAGTGGGAC CAGCTTCTTT GCTGGCTTCG TGGTCTTCTC CATCCTGGGC 
CAGAGCAGGO CGTGCACATC TCCAAGOTOQ CAGASTCAGO GCCGGOCCTG 
CCTACCCQCQ GGCTGTCACa CTGATCCCAO TGOCCCCACT CTGGGCTGCC 
TCAmjCTGTT GCTOCTTGOT CTCGACAGCC W3TTTGTAGG TGTGGAGGGC 
GOCTCCTCGA CCTCCTCCCG GCCTCCTACT ACTTCOOTTT CCAAAGGGAG 
CCCTCTGTTG TGCCCTCTGC TTXGTCATOG ATCTCTCCAT GGTGACTGAT 
ACGTCTTCCA GCTOTTTGAC TACTACTCQG CCAGCGGCAC CACCCTGCTC 
TTTGGGAGTG CQTGGTGGTG GCCTGGGTGT ACGGAGCTGA COGCTTCATG 
CCTGTATGAT CGGGTACCGA CCTTGCCCCT GGATGAAATG GTGCTGGTCC 
CGCTGGTCTG CATGGGCATC TTCATCTTCA ACGTTaTOTA CTACOAGCCG 
ACAACACCTA CGTGTACCCG TGGTGGGGTG AGGCCATGGG CTGGGCCTTC 
CCATOCTSTQ CSSTQCCGCTO CACCTCCTGG GCTGCCTCCT CAGGGCCAAG 
CTGAGOSCTG GCAOCAOCTG ACCCAOCCCA TCTGGGGCCT CCACCACTTG 
CTCA6GACGC AGATGTCAGG GGCCTGACCA CCCTGACCCC AGTOTCOGAG 
TCGTCGTGQT GGAGAGTGTC ATCTGACAAC TCAGCTCACA TCACCAGCTC 
GCCATAGCAG CCCCTGCTTC AGCCCCACCG CACCCCTCCA GGGGGCCIGC 
CACTTTTGGG GTCTGCCTGG GGGAGGAGGG GAGAAAGCAC CATGAGTGCT 
ACTTTTTCCA TTTTTAATAA AACGCCAAAA ATATCACAAC CCACCAAAAA 



TGCOIGGGCT B40 



ATTTTCTTCT 
AACATCTGTC 
AACACCTACT 
ACCACGCTGC 
TTCCGCCATG 



CTGGAGGTGC 
CTGGTCTACT 
GCTACATTCC 
OCCCTOGATG 
GTGTGGATAG 
ACAGCCCTGG 
GCTCTCATCA 
TTCATGGCTG 
GCCTTCATCG 
CTGTTCTTCT 
TTCATCACCG 
ATCTCTGTGG 



TTCTTCAOCC 
CTGGTCTACA 
GCCCTOTCCT 
GGCACCATCG 
6AGTACCGAG 
AGCAGCAAGG 
ACCTCTGGTA 
CTTTCCCTGA 



TCCCCCTCCA OCCCTAGCCG AGCTGGTCCT AGGCCCCX3CC TAGTGCCCCA 
CAGTGCTGCA CTCCTCCTGC CCCTGCCACG CCCACCCCCT GCCCACCTCT 
CrCTGCAQCA CACCCOTGGQ TGACCCCTCA CCCCAGAAGC AGCAGTGQCA 
TGTOAGGAAG GGAAGGAGGO AGAGACGGGA GGGAGGAGA6 AGAGGAGAAG 
GAGQGGCAGC AGAACCAAGG CAAATATTTC AGCTGGGCTA TACCCCTCTC 
TTATAGAAGC TTAGAGAGCC AGCCAGCAAT GQAACCTTCT GGTTCCTGCG 
CCAGTATCAA TTGTGTGAGC TTGGGTGOGA GTOCACGCGT QCGTGAGTAC 
TATAQATCTC TATCTCTTAG CAAAGGTGAA TGCC3U3ATGT AAATQSOSOC 
GGAGGCTTGT ATTTTGCACA TTXTATAAAA ACTTaAGAGA ATGAQATTTC 
TTTCTAAAAA GAQGAAGGAG CCCAAACCAT CCTCrCCTTA CCACTCCCAT 
CCTACCTTAC CCCTCTGCCC CTAGCCAAGG AGTGTGAATT TATAGATCTA 
GCAAAACAAA AGCTTOGAQC TGTTGOGTGT GTGAGTCTGT TGTST6GAT6 
GTOCCCAGCC CCAGACTGGA TTGGAAAAGT GCATGGTGGG GGCCTCGGGG 
GCTGTCCCTT TGCCACAAGT CTQTGGGGCA AOAGGCTGCA ATATTCCGTC 
GGGCTGCTAA CCTGGCCTGC TCAGGCTTCC CACCCTGTGC GGGGCACACC 
OACCCTGGAC ACGGCTCCCA OGTCCAGGCT TAAGGTGGAT GCACTTCCOB 



CCCCCACCCA 



GCTTGGGAAA 
GGAGGCAGGG 
CCCATCCCTG 
CCAATCGCCA 
GQAGAGTATA 



OCCTGTGASC 
ACTTTCATAG 
TGCGTGTGTG 
CTGTCCCCAC 
CIGGGTGTCT 



1260 
1320 
1380 

1500 
1560 
liS20 
1680 



2100 
2160 
2220 
2280 
2340 

2460 
2520 
2580 
3640 
2700 
2760 
2820 
2880 

3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 



CTTCTGTGTA GCAGCTTTAA CCCACGTTTG TCTGTCACGT CCAOTCCCQA. C
TGACCCCAAS AAAGBCTTCC CCGACACCCA GACAGAGQCT GCAGGGCTGQ GGCTGGGT6A 3840 






1 



MAKKSAEHGI YSVSC 



FIMSCVQFAV 
IHVWNICPLF 
CVBIFRHEDC 
ACWVLVYFCV 
GSPQVWIDAG 
SILGFMAAEQ 
GVEGFITGLL 
TTIJjWQAFMB 
YYEFLVmHT 
LHRLEYRAQD 



GliOIVHRFFY 
KGLGVASMVI 
AHASUUre>TC 
WKGVKSTGKI 
TQIFFSYAIG 
GVHISKVAES 
DLLPASYYFR 
CWVAMVWSA 
YVYPWWGEAM 
AOVRGLTTI.T 



PLIAFGPDGA 
IiCYKHGGOVF 
VFVCNTYYIM 
DQLADRRSPV 



LGALTAI<GS¥ 
GPGIiAFIAYP 
FQREI5VALC 
DRFMDDIACM 



LIFYVIiIALV 
VLAHGFYYLV 

VIjWIiVRGV 
NRFNNNCYKD 
RAVTLMPVAP 
CALCFVIDL5 
IGYRFCPWMK 
CVFIiHLUSCL 



[ SLGQFMKAGS 120 



tSCGLEVEGA 
IJ.PGAU3GII 
AIILALINSG 
IiWAAIiFFFML 



LMWEVTLCU. 
YYLKPDWSKI. 
TSFFAGFWF 
LLLQLDSQFV 
OtPDYVSASG 
CMOIFIFHW 
HQRLTQPIVIG 






Seq ID NO: 68 DNA sequence

Nucleic Acid Accession #: NM_021953.1

Coding sequence: 178-2469



CAGTCTGGAG 
AAAGCTAGCC 
AATGCCCCAA 
AATCAAGCAG 
ATCAAGATTA 
GCTAATATTC 



GGTCCACACT 
CCCGTCGQCC 
GTGAAACATC 
AGGCCTCCAA 
TTAACCACCC 



GCCCGCCTTC 
TGTQATTCTC 
ACTGATTCTC 
AGAGGAGGAA 
GGAAGTGGCG 
CACCATGCCC 



GOACCAAAAC 
CAGAAACGGG 
TCCAACATCC 



CCTTCGAGAC 
ATGGCCATGft 
ATCTATACGT 
AAGAACTCCA 
AATGGCAAGG 
CAGGTOTTTA 
CAGAAACGAC 



CTGCAGCTAG 
AGACCTGTGC 
AGTGGCrrCG 
AGGAAAAOGA 
CATCAGCGTC 
TACAATTCGC 
GQATTQAGGA 
TCCGCCACAA 



CAGTTCCC6Q 
GCCGCTTCCC 
AAGGTGCTGC 
GAGAAACTCC 
GAAGAAATCC 
CCTCCCTTGG 
TGGGAGGATT 
TCCCCAACCC 



TTCTCAGA6G 
TCT0ACCCT6 
ATTAAGGAAA 
GAATCCTGGA 
ACCTCCCAGG 
ACTCCCTTGC 
TTA6ACCTCA 
CCAGGCTCCX: 
CTGGTCCTGG 



TCATGAGCTC 
T AGCT GAGGA 
TGTTTGGAGA 
AGCCTGGGGA 
AAGAGTGGCC 
CGTCCCAATC 
GGTOTGTCTC 
GOAGGAAACA 



CATCAGCTGT 
CTATGATGCC 
GGATGTGAAT 
AOATGGTGAG 
AAAGATGAGT 
GAATTGTCAC 
CTGGCAGAAC 
CA-rCAACAGC 
CCACTTTCCC 
CCTTTCCCTQ 
GACCATTCAC 
CCCAGGGTCT 

GCCACTGCTA 
ACTGOTOTTQ 
AGAGCTTGCC 
GGGGATAGCT 



GGAGCTACGG 
AATGGAGAGT 
AAAAQACQGA 
CCTAAGAGAT 
GAGTCCAACT 
AACACGCAAG 
ACTGCCAAGG 



CCGGGGCCCT 
CCTAACGGCG 
GAAAAC6CAG 
GGCTGCCCCT 
CCCCTGCCCA 
CTTGCAAGTT 
TAGTGGCCAT 



TCTGATGGAC 
CTGGAGCAGC 
TCTGTGTCTG 
ACTGAGAGGA 
TACTTTAAGC 
CACGACATGT 



CAACTCAGCC 
AAGTGACCCT 
CACCTGGAGC 
OCACTATCAA 



TCCTGTTCAA 
ACAGGAGTCr 
TCCAGCTGGG 
CCCCAACAAT 
TGGCAGTAGT 



GGAGACCTTG 
CCTTTGCGAG 
CAATAGCCTA 



CCACAATTGC 
AACATGACCA 
CCAOGGGTCA 
CAQCCCICGQ 



CTACTCTTAC 
TTTGAAAGAC 
GCCAGGCTGG 
GACGTCTGCC 
GACATTGGAC 
GGAATCACAG 
ACTCCCCCTG 
GCTCATACCT GGTACCTATC 



AGCGCATGAC 
ACATTGCCRA 
TTGTCCGGGA 
ACCGCTACTT 
CCGAGCACTT 



CCTCCCAGCT 



GGCTCACGCC 
GTGCCTCTGA 
AAAQTGCTCC 
TCTCCGTCCC 
CGGAGCCACA 
ACACAATGAA. 
AGQACOCACT 



CTCCCCGGCC 
TCCCACCCCA 
GGAAATGCTT 
6CATCTACT0 
TTCCCGCIG6 
CAGCTACTCC 
CTCCTCCACC 
CCCAGCCAAA 
CCCCTTGCCT 



CCTCTTTCTT 
CCTTTGCTTC 
CACT TAGC GA 
CCATCTTTCA 
AGACCCAAGA 
OTBAT TCAA C 
CCTCCCTGTG 



AGOGASTCCG CAXTGCCCCC 
CTGCAGGACC AG6GAAAGAG 
CAGTTCAGAC TATCAAGGAG 
GACCCATCAA AGTGGAGAGC 
AAGAGGAATC ATCTCACTCC 
AGTCCTACAG TGG6CTTAGG 




CTTTGGCAAC 
GGTTTCTGGC 
TGACASCCTC 



TAGCAOTCAC ACCCTAGCCA CTGCTGGGAC 



CTCTGAGTOA 
TATGCAAAAG 
CTGATTCCTC 
AAAGAGATTA 
CTGATCTTTG 
TAAATGTAAG 
ACCTOGOGTT 



GACCCCCTGG 
TCACC6CAAA 
TCTTCTOCCT 
CTTGCAGOCA 
AGCAAGATCC 
AACATCAACT 
CTCAAfiCTGT 



GGCTGAXGGA TCTCAGCACC 
GGCTCCTCAG TTCAGAACCC 
CAGATATAGA CGTCCCCAAG 
ATGGTTCTCr GACAGAAQGC 



GQTCCCAGTT TATTCCTGAG 



CTTATTTCCT 



CCCTCCCACC 
AAGTCTTTTG 
AGAGTGTGGG 
CCAGGGAGAC 
TGACCTGCCT 
GCTGACCGCA 
6ACCHAAAAA 



TATTGGGTCA 
TGCCCAGATG 
TGGCATTGAC 
GGCTTCCTTA 
TGGGIGTGAG 



CASACAAGIG 
TCCAAGTCAG 
GGAGTTGAAT 
TGCGCTATTA 
GAGAACTCAG 
GCTTGCCCCT 
CCAGCTTQAO 



TGCTGTCCCT GCCAGGAGCT 
agcctgtttc 
CGTGTAAATA 
TAGATCATTA 
TCTGTTCCTT 
GCTGAGGTAC 

QATCTGCTTG 
CTTTCCTGCA 
TTGGGGTGGG 
GATGTTTCTC 
GTGGAGGCTT 
CAGCTTTGCA 
AACACTAACT 



AGAAGAAATC 
AGGATGGATG 
TGATAATGTC 
GAGAAGGCCG 
AAGAGCCACC 
ACTCAATAAA 



GAAGGGTGG6 
CATTCTCTGC 
GTATAAATTC 
TCCAGAGACT 
QCTTTTAGTT 
CTGGATCTTO 
TTTTGCCCCT 
CTGGTTAAAA 
CAACTQAAGC 
CCCAATCATA 
AAAGGQCCCC 
CTAGGCCCCA 
AQCGAAGGTG 



1320 
13S0 
1440 

1500 
1560 
1620 
1680 
1740 

1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 



2640 
2700 
2760 
2820 
2880 



3180 
3240 
3300 
3360 
3420 
| | | | | |

MKTSPRRPLI LKRRRLPLPV QNAPSETSEE EPKRSPAQQE SNQAEASKEV AESNSCKFPA 60

GIKIINHPTM PNTQVVAIPN NANIHSIITA LTAKGKESGS SGPNKFILIS CGGAPTQPPG 120

LRPQTQTSYD AKRTEVTLET LGPKPAARDV NLPRPPGALC EQKRETCADG EAAGCTINNS 180

LSNIQWLRKM SSDGLGSRSI KQEMEEKENC HLEQRQVKVE EPSRPSASWQ NSVSERPPYS 240

YMAMIQFAIN STERKRMTLK DIYTWIEDHF PYFKHIAKPG WKNSIRHNLS LHDMFVRETS 300

ANGKVSFWTI HPSANRYLTL DQVFKPLDPG SPQLPEHLES QQKRPNPELR RNMTIKTELP 360

LGARRKMKPL LPRVSSYLVP IQFPVNQSLV LQPSVKVPLP LAASLMSSEL ARHSKRVRIA 420

PKVLLAEEGI APLSSAGPGK EEKLLFGEGF SPLLPVQTIK EEEIQPGEEM PHLARPIKVE 480

SPPLEEWPSP APSFKEESSH SWEDSSQSPT PRPKKSYSGL RSPTRCVSEM LVIQHRERRE 540

RSRSRRKQHL LPPCVDEPEL LFSEGPSTSR WAAELPFPAD SSDPASQLSY SQEVGGPFKT 600

PIKETLPISS TPSKSVLPRT PESWRLTPPA KVGGLDFSPV QTSQGASDPL PDPLGLMDLS 660

TTPLQSAPPL ESPQRLLSSE PLDLISVPFG NSSPSDIDVP KPGSPEPQVS GLAANRSLTE 720
GLVLDTMNDS LSKILLDISF PGLDEDPLGP DNINWSQFIP ELQ

Seq ID NO: 70 DNA sequence
Nucleic Acid Accession #: BC006529.1
Coding sequence: 178-2424

1 11 21 31 41 51

| | | | | |

GGCAGGAGGG GGACCOGGCC GGTCCGGOGC GAGCCCCCGT CCGGGGCXXTC GGCTCQOCCC 60 

CCAGGITGGA GGAGCCCGGA. CXXCGCCTTC GCSAGCTROGG CCCAACG60G GOGGCQACTG 120 

30 CA6TCTGGA0 GGTCCACACT TGrGATTCTC AAXaQAOAOT OAAAAOQCAO ATTCATAAT6 IBO 

AAAACTAGCC CCCGTCGGCC ACTGATTCTC AAAAGACGGA GGCTGCCXCT TCCTGTTCAA 240 

AATGCCCCAA GTQAAACATC AGAGGAGGAA CCTAAGAGAT CCCCTGCCCA ACAGGAGTCT 300 

AATCAAGCAG AGGCCTCCAA GGAAGTGQCA GAGTCCAACT CTTGCAAGTT TCCAGCTGGG 360 

ATCAAGATTA TTAACCACCC CACCATGCCC AACACGCARG TAGTGGCC»T CCtXAACAAT 420 

35 GCTAATATTC ACaSCATCAT CACAGCACTG ACTGCCAAGG GAAAAGAGAG TGGC3«rrAGT 480 

GGGCCCAACA AATTCATCCT CATCAGCTGT GGGGGAGCCC CAACTCAGCC TCCAGGACTC 540 

CGGCCrCAAA CCCAAACCAG CTATQATGCC AAAAGGACAG AAGTGACCCT GGAJ3ACCTTG 600 

GOACCAAAAC CTGCAQCTAG QGAT6TGAAT CTTCCTAGAC CACCTGGAGC CCTTTGCGAG 660 

CASAAACX3GG AGACCTOTGC AGATGOTGAG GCAGCAGGCT GCACTATCAA OUVTAGCCTA 720 

TCCAACATCC AGTGGCTTCG AAAGATGAGT TCTGATGGAC TGGGCTCCCG CAGCATCAAQ 780 

CAAGAGATGG AGGAAAAOGA GAATTQTCAC CTGOAGCAGC GMaGGTTAA GGTTGAGGAG 840 

CXrrrCGAGAC CATCAGCGTC CTGGCAGAAC TCTGTGTCTG AGCGGCCACC CTACTCrrAC 900 

ATGGCCATGA TACAATTCGC CATCAACAGC ACTGAGAGGA AGCGCATGAC TTTGAAAGAC 960 

ATCTATACGT GGATTGAGGA CCACTTTCCC TACTTTAAGC ACATTGCCAA GCCAGGCTGG 1020 

45 AAGAACTCCA TCCGCCACAA CCTTTCCCTG CACGACATGT TTGTCCGGGA GACGTCTGCC 1060 

AATGGCAAGG TCTCCTTCTG GACCATTCAC CCCAGTGCCA ACCGCTACTT GACATTGGAC 1140 

CAGGTGTTTA AGCAGCAGAA ACX»CX3»AT CCAGAGCTCC GCCGGAACAT GACKATCAAA 1200 

ACCX3AACTCC CCCTGGGCGC ACGGCGGAAG ATGAAGCCAC TGCTACCACG GGTC3U3CTCA 1260 

TACCTGGTAC CTATCCAGTT CTCGGTGAAC CAOTCaCTGG TGTTGCAGCC CTCGGTGAAG 1320 

50 GTGCCATTGC CCCTGGCGGC TTCCCTCATG AGCTCAGAGC TTGCCOGCCA TAGCAAGCX3A 13 BO 

GTCCGCATTG CCCCCAAGGT GCTGCTAGCT GAGQAGGGGA TAOCTCCTCT TTCTTCTGCA 1440 

GGACCAGGGA AAGAGGAGAA ACTCCTGTTT GGAGAAaaGT TTTCTCCTTT GCITOCAGTT 1500 

CAGACTATCA AGGAGGAAGA AATCCAGCCT GGGGAGGAAA TQCCACACTT AGCGAGACCC 1560 

ATCAAAGTGG AGAGCCCTCC CTTGGAAGAG TGGCCCTCCC CGGCCCCATC TTTCRAAGRG 1620 

55 GAATCATCTC ACTCCTGGGA GGATTCXTTCX: CAATCTCCCA CCXXSU^GACC CAAGAAGTCC 1680 

TACAGTGGGC TTAGGTCCCC AACCXXSGTGT GTCTOGGAAA TGCTTGTGAT TCAACACAGG 1740 

GAGAGGAGGG AGAGGAGCCG GTCTOGGAGG AAACAGCATC TACTGCCTCC CTGTGTGGAT 1800 

GAGCCX3GAGC TGCTCTTCTC AGAGGGGCCC AGTACTTCCC GCTGOGCCOC AOAGCTCCCM 1860 

TTCCCAGCAG ACXCCTCTGA COCXGCCTCC CAGCTCAGCT ACTCCCAGGA AGTGGGAGGA 1920 

60 CXTTTtAAGA CACCCATTAA OOAAACGCTG CCCRTCTCCT CCACCCOGAO CJUiATCTGTC 1980 

CTCCCX3U3AA CXXX^XSAATC CIGQAGCSCTC ACGCCOCCAa CCAAAOTAGG GGGACTGOAT 3040 

TTCAGCCCAG TACAAACCCC OC31GGGTGCC TCIGACOCCT TCCCTGAGCC CCIGGGGCTQ 2100 

ATGGATCTCA GCACXACTCC CITGCAAAGT GCTCCCCCXX: TTGAATCACC GCAAAGGCTC 2160 

CTCAGTTCAG AACCCTTAGA CCTCATCTCC GTCCCCTTTG GCAACTCTTC TCCCTCa«3AT 2220 

65 ATAQACGTCC CCAAQCCAGQ CTCCCCGGAG CCACAGGTTT CTOGCCTTGC AGCCAATCGT 2280 

TCTCTGACAG AAGGCCTGGT CCTGGACACA ATGAATGACA GCCTCAGCAA GATCCTGCTG 2340 

GACATCAGCT TTCCrGGCCT GGACGAGGAC CCACTGGGCC CTGACAACAT CAACTGGTCC 2400 

CAGTTTATTC CTGAGCTACA. GTRGAGCCXrr GCCCTTGCCC CTGTQCTCAA GCTGTCCACG 2460 

ATCCCGGGCA CTCCAAGGCT CSkGTQCACCC CAAOCCTCTQ ASTGAQQACA GCXGGCAGGa 2520 

70 ACIGTTCIGC TCCTCAXAGC TCCCTGCTGC CTOAXTATGC AAAAGIAQCA GTCACACCCT 2SB0 

A6CCACTGCI GGQACCTTOT tfiTC C Xa a AG AaTATCTGAT TCCTCEGCTG TCCCIGCCAO 2640 

OAGCTOAAGG GTGGGAACAA. CAAAGGCAAT GSTOAAAAGA OATTAGGAAC CCCCCAGCCT 2700 

GTTTCCATTC TCTGCCCaGC AGTCTCTTAC CTTCCCTOAT CTTTQCAGGQ TGGTCCGTGT 2760 

_ AAATAGTATA AATTCTCCAA ATTATCCTCT AATTATAAAT GTAAGCTTAT TTCCTTAGAT 2820 

75 CATTATCC3VG AGACTGCCAQ AAGGTGGGTA GGATGACCTG GGGTTTCAAT TGACTTCTGT 2880 

TCCTTGCTTT TAGTTTTGAT AGAAGGGAAG ACCTGCaffiTG CAOGGTTTCT TCCAGGCTGA 2940 

GOTACCTGGA TCTTGGGTTC TTCACTGCAG GGACCCAGAC AAGTGGATCT aCTTGCCaaA 3OO0 

GTCCTTTTTG CCCCTCCCTG CCACCTCCCC CnGTTTCCAA GTCASCrTTC CTQCRAGAAG 3060 

AAATCCTGGT TAAAAAAGTC TTTTOTATTa GGTCASQAQT TGAAXTTGG6 GTOGGAGGAT 3130 

80 CK»TGCAACr GAAOCAGAGT crrOGCRGCCC JU3AT8TGCX3C TATTAGATOT TTCTCTGATA 3180 

ATGTCCCX»A TCATACCAGQ GAOACTGGCA nQAOOAGAA CTCAGGTGQA OaCTTQAGAA 3340 

GGCCGAAAGG GCCCCTGACC TGCCTGGCTT CCTTAfiCTTQ CXXXnCAGCT TTQCAAAGAG 3300 

CCACCCTAQG CCCC3W3CTGA CC33CATGGGT GTOAGCXAGC TTQAeAACAC TAACTACTCA 3360 
ATAAAAGCX3A AGGTGGAAAA AAAAAAAAAA AAAAAAA 
| | | | | |

MKTSPRRPLI LKRRRLPLPV QNAPSETSEE EPKRSPAQQE SNQAEASKEV AESNSCKFPA
GIKIINHPTM PNTQVVAIPN NANIHSIITA LTAKGKESGS SGPNKFILIS CGGAPTQPPG
LRPQTQTSYD AKRTEVTLET LGPKPAARDV NLPRPPGALC EQKRETCADG EAAGCTINNS
LSNIQWLRKM SSDGLGSRSI KQEMEEKENC HLEQRQVKVE EPSRPSASWQ NSVSERPPYS
YMAMIQFAIN STERKRMTLK DIYTWIEDHF PYFKHIAKPG WKNSIRHNLS LHDMFVRETS
ANGKVSFWTI HPSANRYLTL DQVFKQQKRP NPELRRNMTI KTELPLGARR KMKPLLPRVS

SYLVPIQFPV NQSLVLQPSV KVPLPLAASL MSSELARHSK RVRIAPKVLL AEEGIAPLSS
AGPGKEEKLL FGEGFSPLLP VQTIKEEEIQ PGEEMPHLAR PIKVESPPLE EWPSPAPSFK
EESSHSWEDS SQSPTPRPKK SYSGLRSPTR CVSEMLVIQH RERRERSRSR RKQHLLPPCV
DEPELLFSEG PSTSRWAAEL PFPADSSDPA SQLSYSQEVG GPFKTPIKET LPISSTPSKS
VLPRTPESWR LTPPAKVGGL DFSPVQTPQG ASDPLPDPLG LMDLSTTPLQ SAPPLESPQR

LLSSEPLDLI SVPFGNSSPS DIDVPKPGSP EPQVSGLAAN RSLTEGLVLD TMNDSLSKIL
LDISFPGLDE DPLGPDNINW SQFIPELQ

Seq ID NO: 72 DNA sequence
Nucleic Acid Accession #: U74612.1
Coding sequence: 178-2583

1 11 21 31 41 51 

2j GGCAOQAGGG GOACCaSGCX gotccxsgcgc gagcccccgt ccgggcccct ggctcbgccc 

CCAGGTTGGA GGAGCCCGGA GCCCGCCTTC GGAGCTACGG CCTAAC33GCG GCGGCQACTG 
CAGTCTGGAG GGTCCACACT TGTGATTCTC AATCGAGAGT GAAAACQC3U3 ATTCRTAATG 
AAAACTAGGC CCOOTCGGCC ACTGATTCTC AAAAGAGOOA GGCTQCXWrT TCCTGTTCAA' 
AAT6CXXX»A GTGARACRTC AGAGQAGGAA CCTHAGAQAT CCCCTGCCXA ACAGGAGTCT 

30 AATCAAGCAO AGGCCTCCAA GGAASTGGCA GAOTCTAACT cntKaAGTT TCCAGCTGOO 
ATCAAGATTA TTAACCRCCC CACCATGCCC AACACX3CAA6 TAGTOGCCAT CCX3CAACAAT 
GCXAATATTC ACAGCATGAT CACAGCACTG ACTGCCAAGG GAAAAGAGAG TGGCAGTAGT 
GGGCCCAACA AATTCATCXTT CATCAQCTGT GGGGGAGCCC CAACTCAGCC TCCAGGACTC 
OGGCCTCAAA CTCAAACCAG CTATGATGCC AAAAGGACAG AAGTGACCCT GGAGACCTTG 

35 GGACCAAAAC CTGCAGCTAG GGATCTGAAT CTTCCTAGAC CACCTGGAGC CCTTTGCGAG 
CAGAAACGGG AGACCTGTGC AGATOSTGAO GCAGCAGGCT GCACTATCAA CAATAGCCTA 
TCCAACATCC AGlGGCrTCG AAAGATGAiST TCTGATGQAC TGGGCTCCCG CAGCATCAAG 
CAAGAGA1X36 AOQAAAAlGGA OAATTGICAC CIGQAGCAOC C 
. . CCTTCGM3AC CATCAGOGTC CTQGCAQAAC TCTOTGTCTG I 

40 ATGG0C3VTGA TACAATTOGC CATCftACAGC ACTSASAGSA AGCSCATSAC TTTSAAAGAC 960 

ATCTATACOT GGATTGAGGA CCaCTTTCCC TACTTTAAGC ACATTGCC31A GCCAGGCTGG 1020 

AAGAACTCCA TCCGCCACSVA CCTTTCCCTG CACGACS^TGT TTOrCCGGGA GACGTCTQCC 1080 

AATG6CAAGG TCTCCTTCTG GACCATTCAC CCC31GTGCCA ACCQCTACXT QACATTGOAC 1140 

CAGGTGTTTA AGCCACTOGA CCCAGGGTCT CCACAATTGC CCGAGCACTT GGAATCACAG 1200 

45 CAGAAACX5AC CSQAATCCAOA GCTOOGCCGQ AACATGACCA TCAAAACCGA ACTCCCCCTQ 1260 

GGCGCAOGGC GGAAGATGAA QCCACTGCXA CGRCQGGTCA GCTCATACCT GGTACCTATC 1320 

CAGrrCCOGG TQAACCAQTC ACIGGTGTTG CAGCCX:TCGG TGAAGGTGCC ATTGCCCCTS 1380 

GCGGCTTCCC TCATGAGCTC AGAGCTTGCC CGCCATAGCA AGCGAGTCCG CATTGCCCX:C 1440 

AAGGTTTTTG GGGAACAGQT GGTGTTTGGT TACATGAOTA A6TTCTTTAG TGGCGATCTG IS 00 

50 C5GAGATTTTG GTACACCXSIT CACCAOCTTG TTTAATTTTA TCTTTCTTTQ TTTATCAGTQ 1560 

CTGCTAGCTG AGGAGGG6AT AGCTCCTCTT TCTTCTGCAG 6ACCAGGGAA ASAGGAGAAA 1620 

CTCCTGTTTG GAGAAGGGTT TTCTCXTTTG CTTCCAGTTC AGACTATCAA GQAGGAAOAA 1680 

ATCCAGCCTG GGGAGGAAAT GCCACACTTA GOSAGACCCA TCKAAGTGGA GAQCCCTCCC 1740 

TTGGAAGAGT GGCCCTCCXX: GGCCCCATCT TTCAAAGAG6 AATCATCTCA CTCCTGGGAG 1800 

55 QATTCQTCCX: AATCTCCXaC CCCAAGACCC AAGAAGTCCT ACAGTGGGCT TAGGTCCCX» 1860 

ACOCGGTGTG TCTCGGAAAT GCTTGTGATT CAACACAGGG AGAGGAGGGA GAGGAGCC3GQ 1920 

TCTCGGAGGA AACAGCATCT ACTGCCTCCC TGTGTGQATG AGCCXX5AGCT GCTCTTCTCA 1980 

GAGGGGCCCA GTACTTCCCG CTGGGCCGCA GAGCTCCCGT TCCCAGCAQA CTCCTCTGAC 2040 

OCTGCCTCXX; AGCTCAGCTA CTCCCAGGAA GTGGGAfiGAC CTTTTAAGAC ACCCATTAAG 2100 

60 OAAACXXZTGC CCaTCTCCTC CACCCaBAGC AAATCCGTCC TCXXX»GAAC CCCTQAATCC 2160 

TGGAGQCTCA OGCCCCCAGC CAAASTAG60 GGACTGGATT TCAOCOCAar ACAAACCTCC 2220 

CAGGGTGCCT CTGAaXCTT GCCTOACCKC CIGGGQCraA TOGATCTCnC CACCACTCCC 2280 

TTGCAAAQTG CTCCTCCCCT TGAATCACCG CAAAGGCTCC TCRGTTCRaA ACCCTTAGAC 2340 

CrCATCTCCG TCCCCTTTGG CAACTCTTCT CCCTCAGATA TAQAOGTCCC CAAQCCAGGC 2400 

65 TCCCCGGAGC CACAGOTTTC TGGCCTTGCA GCCAATCGTT CTCTQACAXsA AGGOCIGOTC 2460 

CTGGACACAA TGAATQACAG CCTCSU3CAAG ATCCTGCTGQ ACATCAGCTT TCCTGGCCTG 2520 

GACGAGGACC CACTGGGCCC TGACAACATC AACTGGTCCC AGTTTATTCC TGAGCTACAG 2580 

TAGAGCCCTG COCTTGCCCC TGTGCTCAAG CIGTOCACCA TCCOGGGCAC TOCftAGGCTC 2640 

AGTGCA<XCC AAGCCTCTGA 6TGA6GACAG CAaGCAOGGA CIGTTCTQCt CCTCATAOCT 2700 

70 CCCTOCTGCC TGATXATGCA AAAGTAGCAG TCACACCCTA 6CCACTGCIC GOACCTTGrG 2760 

TTCCCCAAGA GTATCTGATT CCTCTQCTGT COCTGCCAQ8 AGCTGAAQGQ TGGGAACAAC 2820 

AAAGGCAATG GTGAAAAGAQ ATTAGGAACC CCCCAOCCTG TTTOCATTCT CTGCECAGCaV 2880 

GTCTCTTACC TTCCCTGATC TTTGCAGGGT GGTCCX3TGTA AATAGTATAA ATTCTCCAAA 2940 

TTATCCTCTA ATTATAAATG TAAGCTTATT TCCTTAGATC ATTATCCAGA GACTGCOVQA 3000 

75 AGGTGGGTAG QATGACCTGG GGTTTCAATT GACTTCTGTT CCTTGCTTTT AGTTTrGATA 3060 

QAAGGGAAGA CCTGCAGTGC AOGGrTTCTT CC3U3QCTGAG GTACCTGGAT CTTGGGTTCT 3120 

TCACTGCAGG GACCCAGACA AQTQGATCTG CTTGCCaOAG TCCTTTTTGC CCCTCCXTTGC 3180 

CACCTOXCG TGTTTC«AG TCAGCTTTCC TGCAAGAAGA AATCCIGQTT AAAAAAGTCT 3240 

TTTGTATTGG OTCAOGABTT GAATTTGGGQ TGGGAGGATG GftTGCAACIG AAGCAGASTG 3300 

80 TGGGIGCCCA QATSTOCGCT ATTAGATGTT TCTCIOATAA TQTCCCCAAT CATACCAGGG 3360 

AGACTGGCAT TGAOSAGAAC TCAGGTGGAG aCTTGAOAAG GOOQAAAGGS CCCCTSACCT 3420 

GCCTGGCTTC CTTAGCTTGC CCCTCAGCTT TGCAAAQAGC CACCCTAGGC CCCAGCTGAC 3480 

CGCATGGGTG TGAGCC3M3CT TGAQAACACT AACTACTCAA TAAAAGCGAA GGTOGtRCAAA 3540 
AAAAAAAAAA AAAAA 




CCXUVrCTCCT GCAACAAGOA. CCTGTCCTTT GGCCACTCTA GGGCCAGCTC CAAOKTCTGC 960 

ASTOiWSGAaV TOOAGTSCAG TGOGCZGACC ATOCCCftACG CTGTGCAGTA CCTOASCTCC 1030 

CAGGATQA6A AGTACCAGGX: CATTGGGGCC TATTACATCC AOCATACCTG CTTCX3W3GAT 1080 
GAATCTGCXU^ AGCAACAGGT CTATCAGCTG GGAGQCATCT GCAAGCTGGT GGACCTCCTC ' 1140 

5 0C5CRGCCCCA ACCAGAACGT CC3W3CAGGCC GCGGCftGGGG CXXTTGCGCAA CCTGGTGTTC 1200 

AGGAGCACCA CCRACAAGCT GGAGACCCXX3 AGGC3«3AATG GGATCXS3CGA GGCAGTCAGC 12 SO 

CTCCTGAGGA GAACCGGGAA CXXXX3AGATC CAGAAGCAGC XGACTGGGCT GCTCTGGAAC 1320 

CTGTCTTCCA CTGACQAGCT GAAGGAGGAA CTCATTGCC3G ACGCCXTTGCC TGTTCTGGCC 1380 

GACOGCCTCA TCATTCCCTT CTCTGGCTGG TGOGATGGCA ATAGCAACAT GTCX:CX3GQAA 1440 

10 GTGQTGQACC CTGRGGTCTT CTTCAATOCC ACAGGCTGCT TOAGGAACCT GAGCTOqaCC ISOO 

GATOCAOGOC GCCAGACCAT OOBrAACTAC TCRGOGCTCA TreATTOCCT CATGGGCTAT 1560 

GTCCAGAACT GTGTAGC6GC CAGCCGCTGT GACGACAAGT CTGTGGRAAA CTGCATOTGT 1620 

QTTCTGCACA ACCTCTCCTA CCGCCTGGAC GCCQAGGTGC CCACXX»CTA COGCCAGCTG 1680 

GAGTATAACG CCCGCAACGC CTAC3^CCGAa AAQTCCTCXav CTGGCTGCTT CAGCAACAAG 1740 

15 AGCGACAAGA TGATGAACAA CAACTATGAC TGCCCCCTGC CTGAGGAAGA GACCAACCCC 1800 

AAGGflCAGOG aCTGGTTGTA CCATTCAGAT GCCATCCGCA CCTACCTGAA CCTCATGGGC 1860 

AAGAGCAAGA AAGATGCTAC CCTGGAGGCC TGTGCTGGTG CCCTGCAGAA CCTGACAGCC 1920 

AGCAAGGGGC TGATGTCCAG TGGCATGAGC CAGTTGATTG GGCTGAAGGA AAAGGGCCTG 1980 

CCACAAATTG CCOGCCTCCT GCAATCTGGC AACTCTQATQ TGQTGOGGTC CGOAGCCTCC 2040 

20 CTCCTGAeCA ACATQTCCCG CChCCCTCIG CTGCACAOAG TGATGGGGAA CCAGGTGTTC 2100 

CCGeSAGOTGA CCAGOCTCCT CACCAOCCAC ACTGOCAATA CCAGCAACTC GQAAGACATC 2160 

TTOTCCTCGG CCTTGCTACAC TOTSAGGAAC CTQATQGCCT CGCAGCCACA ACTGGCCAAG 2220 

CAOTACTTCT CCAGCAGCAT GCTCAACAAC ATCATCAACC TGTGCCGAAQ CAGTGCCTCA 2280 

CCCAAGGCXJG CAGAABCTQC COOGCTTCTC CTGTCTGACA TGTGGTCXaG CAAGGAACTG 2340 

25 CAGGGTGTCC TCAGACAGCA AGGTTTCGAT AGGAACATGC TGQQAACCTT AGCTGGGGCC 2400 

AACAGCCTCA GGAACTTCAC CTCCCGATTC TAAGAAGAGA CTGTCCAAGC AAGTTAGGCT 2460 

TGCAGGAAGA TATOACCCAG CTGAGAAGCC CTCAC3GCCTC GCTGGAT6GG GTTTTCTGTC 2520

CATCCTOTGC AGTATTTGGG AAAGTTCACA AGAAACTGAG AAQAAACCTA AAAACTGTGQ 2580 

ATAGTGOAAA QATTTTTAGA TTTTrrrrTT CCTTGGGGAA ACTGGC3U3GC AATGGGGGTT 2640 

AGGOAGGTTG OGGGSGGGGS GQCTTTCTTO AGTTAAAaGG GCTTATATGT GATGTCAATA 2700

TTTCTTCCTC TGAGAAATGQ TATATATATQ TGTCTAATGT AAOTGTGTGC ATGCATQTGC 2760 

GCGTCCATGT GTGTQTGTQT QAGTGTCTTA AAGCATAACC ACAAACTGCA AAAAGCTAGG 2820 

TAAGCIATTT TGTTGCAGCT CATAAGGT6G TGAAAAGGAC TCTCCTGTGT TTCTTACTCA 2880 

TAGGCAAGGA CAACATGTGC TTTTTGGTGA GCTGCTCATA ATTCCTGAAA TGTGTGGTGC 2940 

CAGGGCAAGG GGGCCATCAC TGCAGTCAGG CCCTCAGAGG AGTCCTGCAG GCTTCCTACC 3000

AGTGGTCTCC AAGQGTGCAG GAGTAACTGG GGCTGGGCCA GCCTCCCCX:C TTACAAGGCT 3060 

GCTTTCCACQ AAGGGAGGTC TGGTGTATCT CATGGGAGAA TCTGGGGTGT CTGTAGTGTC 3120 

ACOXTCCAG CAGCGCCACA AGQACTGAGG TTGGGTAGGT GTGAGGTTCC: AGAGGACAGC 3180 

AG6ACACTCT OGCATACTTT GCCAAATGAG GCCTGCTCAG AOGAGTAGGA GCTGAAAGAT 3240 

GGTGCCTTCC ACCCTCTTGG GCTGTGTGCC CATCAGAGCA GGCTCAGCCT GCAAAGGCCC 3300

TGCATTC3«3A GGTCTTGTAA TCTACTTGTT GCAGGAGAAA GAAGGTAAAA AATGATTTTT 3360 

TTAASAAAAG CTATTTTATT GCAGCTCTTT CCCAAGAGCT GTTCTGGGAA TGGCrGGTCT 3420 

TCATATTCCC AGTGGAGAGG GGAACAAGTG GGGCTGGGCaV TATACCTATT CCGGCTTCTA 3480 

GTGGGATGGA GTTGGGGTAT AGAAATTAAC CAGGAAGATQ TTTCCACCAA aCCrGCTGTO 3540

AGTCAATTGA GGGAGTGTTT GGGTCCCAGG AGACTTGGAC GGGGGGAOTT TGGGTAGACT 3600

AGQAAAGGAA AGTGCCATAT CAGGGTACCO GTACCGGCAA GCTCACATCT CAGCCAGGGG 3660 

CCaTGCCXTA CTTCCCCTGA CCXS3VGCTGT CTTGTCTCCA CTCTGTOAAA CCCACAGGGQ 3720 

ATGTGATAAA CAGGGCTATT AGGGGTATCA GCCACGTCGA GCXX:CCAGAC TCTGTGCACT 3780 

TCAGACCAGC JU3CAGCAGGA GGGCTCCXXSA GGGCCTTATG AGAAAACXTTQ TGTGGACATC 3840 

CCTTGGTGTA CACTAAGACA GAGC3M3AGCC CAGOGCTCCX: AAGCCTTCCT OCTTCCAGCX 3900

TCTACCTCCaV TQCTAOCATT GCTQGTGTTA OAaAGOAATT AACTTCC3GG XCTGTGCCCT 3960 

TCTCTAGAAO AATATAAOAT GCTCCTOCTC CTCACCCCTT CTCAfiCCTCC TCCCAAGTCT 4020

TCCTCTTCTG CACCACCCXK OAGTCCAAAC CCACCTCTTG CCCCAlBCATT C3U3GCTGGAA 4080 

AACACTGATG TaGACTCAOT ATGACAACTG AGATGGGGQA AGCCASACAT GTGAGQACGC 4140 

TGTCCTCXKA GAGGTGTCCC OGGCTGTTAG CCAGCTGTGC TGTGGTGCTG TGQGTCTGTC 4200

ATACCCTCCC TTGCTTCTGT TCACACTGGG AGGCCCACTC CTGGCTCACC TCTCCCTCTC 4260 

AGGGACCCAC GTGGQAGCCT GGATCCCTQG ACTGTCCTGG GCATAGGTTT CAGGGGCCTC 4320 

CTTTGTTGTC ATCyW3AACCC AGAGGAATTC TTCTCCTAAA AAATACGTAT GGCATACCAA 4380 

TCTGTGOGOO GC3W3TQTCCT AAGCACTTAG ACTACATCAG GGAAGAACAC AGAOCACATC 4440 

CCaSTCCTCA TCOQGCTTAT OTTTTCTGGA GGAAAGiaGA OACACAAGTC CtTGGCTTTA 4500

GGGCTCOCCC QGCTG6G06C TSIGCaWSTCC OGrCAGGGOa GQAGGGGAAA TGCACCXXnG 4560 

CAXQT6AACC TTRCCRGCCC AGGGGGATQC CCCCTCCCCT TAGCACTACC CTQGCCTCCT 4620 

GCATCCCCTC GCCTCATGTT CCTCCCACCT TCAAAGAATQ AAGAGCICCCA TGGGCCCAGC 4680 

CCCTGCCCTG GGAfVCCAGGC ACCCTTCCAG ACCTCAGGGQ CTGAGGCAGA CTATTMGGC 4740 

AGGGCTGACT TTGGTOACAC TGCCCATTCC CTCTCAGGCX: AGCTCAGGTC ACCaSGGCXTT 4800

CTGAIXCAGG CCTGTCACTT TGAGAGGGGC AAAACTGAQA GGGGCTTTTC CTAGAGAAAG 4860 

AQAACAAGGA GCTTGCCAGG CTTCATGTAG CCGACACACXS TCTCAGGATT TTAAGTCCAC 4920 

ATTGGCCTCA CaCTAGCCTA GGCCRATGCC CAAAATAAGG AGTTCCAATT TGGGGCCAAA 4980 

TOAGQAAGQA CACaiQACTCT GCCCTGGGAT CTCXrrGTGCT AGOGGCCAAT GACAAATCCA 5040 

GTCATTGGCC AOSUSCCAOC TCTGC»GTGG QGROCACRCT RGCRGOXTO ACTCCaCACT 5100

CCTCnOGGG ACCX»A6AGG CAGTCTTQCT eTCTOtXSTOT CCACCTTGOA ATCTGGCTGA 5160 

ACTGGCTGGG AGGACCSUV6A CTGOGGCTGG GaTGOQCAGG GAAGGGAAOC CGGGGOCTGC 5220 

TGTQAGGSAT CTTGOAQCTT CCCTGTAGCC CACCTTCCCC TTGCTTCATQ TTTGTAGAGG 5280

AACCTTGTGC OGGCCAGGCC CftGTTTCCTT GTSTQAKACA CTAATQTATT TGCTTTTnT 5340 
GGAAATAGAG AAAATCAATA AATTGCTAGT GTTTCTTTaA AAAAAAAAA






1 11 21 31 41 51 

I I I I I I 

MNHSPLKTAIi AYECPQDQDM STIAIiPSDQK HKTOTSGHOR VQEQVMHTVK RQKSKSSQSS 
TLSESNRGSM VDGLADMYNY GTTSHSSYYS KFOAGNGSMG YPIYNGTLKH EPDNRRFSSY 
SQMEKWSRHY PHGSCNTTQA GSDICFKQKl KASRSEPDLY CDPRGTLEKG TLGSKGQKTT 
QtmYSFYSTC SGQKAIKKCP VRPPSCASKQ DPVYIPPISC NKDIiSFGHSR ASSKICSEDl 
ECSGI.TIPKA VQXI.SSQDSK YQAIGAJOflQ HTCFgOESAK QQVYQLGGIC KLVDIiRSPN 
QNVQQAAAGA IjmbVFBSTT NKIiETRRaiilG IHBAVSIiURR IXaJAEKJHQIi TCU*1NI.SST 
DELKEELIAD ALPVLADHVI IPPSGWCDGN SNMSREWDP EVFFHATGCb DNLSSADAQR
QTMRNYSGLI DSLMAYVQNC VAASHCDDKS VBHCMCVLHN I.SYRLDAEVP TRYRQLEYNA 
RHAYTEKSST GCPSHKSOKH MNHMYDCPLP EBETNPKasQ WLYKSDAIRT YUILMGKSKK 
DATIiEACMSA IiQHbTASKieb MSS6MSQLIG USKGIiPQIA WJ^SOHtSnV VRSGASUiSH 
MSHHPLLHRV MGNQVFPBVT RLLTSHItarT SNSEDILSSA CTTViaiUaS QPQLAKQYFS 
SSMLNHIINL CRSSASPXAA EAAKLLLSDM WSSKELQGVL RQQGFDRNML 6TIAGANSLR 
Seq ID NO: 80 DNA sequence
Nucleic Acid Accession #: NM_006516.1
Coding sequence: 180-1658



TAGTCGOGGQ TCCCCGAGTQ fl 
QTCASAGTOQ CAGTGGGAGT C 
TCGCCACCCX5 
CAGCAAGAAG 
QCAGTTTGGC 
CAACCAQACA 
CTGGTCCCTC 
CCTTTTCGTT 

cgtgtccxk:c 
gggccgcttc 
gggtgaaqtg 
ogtcgtoggc 
cctgtggccc 
gcccttctgc 
caagagtgtg 



3 GGAGCAGGAG 



TTGGCTCCCT 
AGGAGTTCTA 
TCACCACX3CT 
TCTCTGTGGG 
TGCTGGCCTT 
TGCTGATCCT 



CTGACGGGTC 
TACAACACTG 
TGGGTCCACC 
TC3\QTGGCCA 
AACXXSCTTTG 



GCAGCCAGAG 
GCCTCATGCT 
GAGTCATCAA 
GCTATGGGGA 



ACCAAACGAC 
CTGAGGGGGA 
CC3VCCAGCXK: 
GGCTGTGGGA 



AGCTGQGCAT 
GCAACAAGGA 
GCATCGTGCT 



TGCAGGAOAT 
AGCPGTTCOG 
CCCAGCAGCE 
OGGGGGTGCA 
CTGTCGTGTC 
TCGCTGGCAT 
TACCCTGGAT 



ATCATCXMTG 
TCACCCACAG 
ATCCTCATCG 
CTGCTGCTGA 
CCCGAQAOTC 
CTAAAGAAGC 



GCTTCTCGAA 
TGTACTGCtM 
CXrrTTCGTGG 



GCATCATCTT 
TGCGCGGGAC 



CTCCOCOGCC 
GTCTGGCATC 
GCAGCCTGTG 
GCTGTTTGTG 
GGOGGGTTGT 
GTCCTATCTG 
CKCCATCCCA 
TGCCGTTGCA 
TGTGGAGCAA 
CTTCATCTTC 
TTCCGGCtTC 



TACOQCXaGC 
AAOGCTQTCT 
TATGCXACCA 



CXaVTCCTCAT 
TCTATTACrC 
TTGGCrcCGG 



GAGCATCCTG 
TGGGGGCATG 
TTCAATGCTG 
ACTGGGCAAG 
CCTGACCACA 
GGCCCTGGGC 
CGGCCTGGAC 
CATCCCGGCC 
GCTCATCAAC 
AGCTGACGTG 
GAAGAAHGTC 
CGCTGTGGTG 



GCCATACTCA 
AGCATCGTGG 
TGGTTCATCG 
GGCTTCTCCA 
CTGTGTGGTC 
ACCTACTTCR 



CCRTCTTTGG 
TGGCTGAACT 
ACTGQACCTC 
CCTACGTCTT 
AAGTTCCTGA 



TATCGTCRAC 
GACCCTGCAC 
GCTAGCACTG 
CTTTGTGGCC 
CTTCAGCCAG 
AAATTTCATT 
CATCATCTTC 
GACTAAAGGC 



ATGATGAACC 
TCCTTTGAGA 
GGCTTCGTGC 
ACCCTGCACC 
TCCATCATGG 
CTGCTGCAGT 
CGCAACGAGG 
ACCCATGACC 
ACCATCCTGG 
CTGCAGCTQT 
TTOSAOAAQG 



CTCATAQGCC 
CT GGAG CAGC 
TTCTTTGAAG 
GGTCCAOGTC 
QTGGGCATGT 
ACTGTGCTCC 
CGGACCTTCG 



C AQCAGCCCTA AGSATCTCTC 



! ATGTCAGCOG 
CCAOAAGAAT ATTCAGGACT 
AAATCTATTC AGACAAGCAA 
ATATCAGCCT QAGTCTCCTG 
GAGGGTGGAG ACTAAGCCCT 
CTGGACCTAT GTCCTAAGGA 
GAGGTGGCTA TGGCCACCCG 
CATTAGGATT TGCCCCTTCC 



TAAC GGCT CC 
CROSTTTTAT 
TGOCCACATC 
GTGGA6ACAC 
CACACTAATC 



AAGTGTGAGT CGCCCCAGAT CACCAGCCCG 
AGGASCACAO GCAGCTGGAT GAGACTTCCA 
GGGQCTCCTT TCTCCAGCCA GCAATGATQT 
AGGATTTTAA CAAAAGCAAG ACTGTTGCTC 
AATTTTTTTA TTACTGATTT TGTTATTTTT 
CCAGGCTTCA CCCTQAATGG TTCCATQCCT 
TTGCCTTCTT CACCCAGCTA ATCTGTAGGG 
GAACTATGAA CTACAAAGCT TCTATCCCAG 
CTG6ATCTCC CCACTCTAGG GQTCAGGCTC 



TGCAAGATAT TTATATATAT TTTTGGTTGT CAATATTAAA 
ATATCTGGAC AAGCCAACTT GTAAATACAC CACCTCACTC 
TATAAATGGC TGGTTTTTAG AAACATGGTT TTGAAATGCT 
TTTGGATGGG AGTGAGACAQ AAGTAAGTGG GGXXGCAACC 
aACTGAGGAT CCaVOTCCCTT ACACGTACCT CTCATOUm} 
TTTOATCCCT OTTACCCAGA GAATATATAC ATTCTTTATC 
KTCACaTATT TGATAGITGO TGTTCAAAAA AACACTAGTT 
AGGCTTQAAA TCOCATTATT TTGAATaTGA AGGGAA 



TACAGACACT AAGTTATAGT 
CTGTTACTTA CCTAAACAGA 
TGTGGATTGA GGGTAGGAGG 
ACTGCAAC6G CTTM3ACTTC 
TCCTCTTGCT OUkAAATCTG 



TTGIGCCAGC GGTGATGCTC 



11 



21 



31 



41 



1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500
1560 
1620 
1680 
1740 
1800 
1860 

1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460
2520

2640 
2700 
2760 
2820 



I I I I I 

HBPSSKKbTO RIMIAVGGAV LGSLQFGYNT GVINAPQKVI EEFYNQfTWVH 

LTTIiHSLSVA IFSVQQMIGS FSVGLFVNRF GRRNSMUWM UjAFVSAVLM 

taJLGRPIlG WCOLTTGFV PMYVGEVSPT APRGAIXSTLH QUJIWOH,! 

GNKDLWFLLL SIIPIPALLQ CXVI.PPCPES PRFUiINRNE ENRAKSVI.KX 

LQEMKEESRQ KMRBKKVTIL ELFRSPAYRQ PILIAWUXi BQQtiSGINAV 

AGVQQPVYAT IGSGXVMTAF TWSI.FWER AGRRTLHIiIO I.M3MA6CAII. 

LPWMSYLSIV AIFGFVAPFE VGPGPIPWPI VABIJ'SQGPS PAAIAVAffi'S 

CPQWBQLCG PYVPIIFTVL LV1.PPIFTYP KVPBTKBRTP DEIASGFRQO 
BLFHPI.GADS QV 



RYGBSILPTT 60 

GPSKLGKSFG 120 

AQVFGliDSlM 180 

LROTADVTHD 240 

FYYSTSIPBK 300 

MTIALAUiBQ 360 



QASQSDKTPE 480 



I I I I I I 

GGGGGCGCOG CGCGCTGACC CTCCCTGGGC ACCOCTGOGO ACQATGOCGC TGCTCQCCTT 
GCIQCIGGTC GTGGCCCTAC CGCGGGTGTG GACAGA03CC AA0CIGMnX3 GGAGACAACG 
AGATCCAGAG GftCTCCCAGC
GAACGOACGA G






CTCOSTTATA GX3GGC0GTOA. 



AAATATTTCC 
AGAGACCCAA 
TCAAGTGTTG 
TCAAAGAATA 



CTGAGCCTTC 
TTTCIGTTQA 
AGGTGCAGTT 



GCCaGAGQAQ 
TAAAATTCGC 
TGCTGGGAGC 
CTCCATTGCA 
CGGAGCATQG 
TTAOCTCTTG 
GGCTCTTAAC 



AAGCGGTTTC 
TACTGCAATT 
ATGGGTGAGA 

GccGGcxrrcA 
ACTCGcrcxa 

GTTTGACTTC 
CCrCAAGGGT 



CTGCCACAOA 
TTAAACTTGT 
AGTGGGGATC 

GAA6TCCAGA 

TTCTAACTCA TTTATT6CI6 ATQOOCACIC TTTTCCTT6A 
GAGGCCTAAG TACCACTCAT 
QCAGGAACAC TGGGGGAGTC 
CTCAGCATGG GGGGCAGTGG 
CTGTGGATQG CTGCTTTTCC 
AATTGTGTTG AAOAAACTTA 
ATTCCCACAC GTGTGTGTTC 
ATG60GTTCA TTTCTCTGTT 



CTTCAGTATT 
CTTCCGACCT 
ATCCCTGQAG 
CCATTCCAGT 
CCAGGAGGCA 
ACAGGTGCAC 
AGTAGAGA6C 
GAGCTGAOAA 

CACQAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA 



GCCTGTCTTG 540 

GACCGTTGTC 600 

CCAGGGTCTT 660 

TCTTTAACTC 720 

TTTCTCTTTG 780 

CTOCCCTCK3 840 

GQAGAGTATG 900 



AGCCACGGQA 
ACCIGTTGCA 
GGGATGGGAG 
ACATTCAGAG 
AAATCAAACC 
CCTCTGAGGG 
TGCTGAGATG 
GGGTGAAGAC 
AGGGCTGCrC 
CTACCAQATT 
ACCAGCTGGC 
ACTTAGOCCA 
CATCCATQG6 
TTCAAAAGTT 



i I I I I i 

MALIiALLLVV AI.PRVWTDAW ZiTARQRDPBD SQRTDKGDNR VWCHVCKRKN TFECQNPRHC 
KHTEPYCVIA AVKI7PSPFM VAKQCSAGCA AMERPKPEEK RFLIiEEFMPF FYLKCCKIRy 
CNIiEOPPItrS SVPKEYAGSM GESCQaLHIA lULLASIAA GLSLS 



GAAGATAACT 
AGATTCATAA 
ATC3UX3GTAA 



GGCACACGTT 
TCAACCTTTC 
GACTTCACCC 
AACATCTGAA 
AAGATGCAGC 



Seq ID NO: 84 DNA sequence 

Nucleic Acid Accession #: NM_022893.1

Coding sequence: 229-2726



I I 
AAAAAAAAGC CATGACGGCT 
TTGTATTATT TCTAAT TTAT TTTSQATQTC 
AGTCTCCTTC TTTCTAACCC GGCTCTCOCG 



CTCXKACAAT 



TTTTCTCTGG 
CGCCCGCCGC 
AAGCAAGGCA 
ATTCTTACAG 
CTCCTCACCT 



AACCCCAGCA 
ATGATGAACC 
GTGGGCAGTG 
GQAAACAA1G 



cttaagcaaa 
agaccacggc 
ccagatgaac 
caatggcaxk: 
gatqaaaaaa 



CTCTTGCAAC 
AGTCCCCTGA 
CCACCTCTCC 
GG ATCAG TAT 
CTGTTTAGTC 
GAAGAAA7GG 



CAGATAAACT 
TCCCCAOGCC 
GCAGCTACAC 
ACGCACAGAA 
CCCCGCQGGT 
ATGGGATTCA 
OGAGAGAGGC 



CCATTOCAGC 
TCCGGCCCTC 
ACGTTCAAAT 
TACAAQTGCA 
AAGACXXaVCA 
GCCAGCTCCC 
TCOGTGQTGQ 
GAGGAG6AAG 
CTGACGGAGA 
CACOAGAACA 
GACGTCATGC 
GTCCTGGGCG 
TGCGACGAAG 
CGCGGCTGCT 
AGCCCCAGCT 



CACCACOSAG 
CCCTGGCCAC 
TGGAGCCTCC 
GCrCACCGCT 
CAGGTAGCAA 
CTCCCTCCCA 



TGGQATGAGT 
ATGTACAACT 
CACTCATGGA 
TGGTATCX:CT 
TATTGCAGAC 
TCXX3GCCTG 



ACATCACTTG 
CCATCACCCG 
CGCCATGGAT 
GTCCCCAGGC 
GCOGCCCTTC 
GCCCCGGGTC 



CGGGAATTCT 
CCGTTGGGAG 
TTCCC3V.TTGG 
CTCTGCTTAG 
GCATCCAATC 
ACGTCATCTA 
AGOaaCCTCT 
GC3VGAATATG 
TGCAAACAGC 
TTAAGAATCT 
TCAGGACTAG 
AATAACCCCT 
GCAGA 



CTCCAGAAGG 
GGGACATTCT 
AAAAAGCTGT 
CCGTGGAGGT 

gaaoaatttg 
cctcccx:tcg 
cxiccgcaggg 
cattcaccag 
acttagaaag 
gtgcagaato 
ttaacxttgct 



GACCCCCACC 
AGTGCCTTTG 
TTCTCTAGGA 



GCATAGAGCXS 
ACAGGGTGCT 
GACTTAGAGA 
CTATGCAAAQ 
CCCCCCTCCC 



51 
I 

TCATCTTCCC 
GATOAAGATA 
AGCCGTCGTC 
GTCTCGCCGC 
TCTTGAAGCC 
GGATCATGAC 
TATTTTTATC 
GGATAAGCCA 
TGGCATCCAG 
CCCCAAACAG 
TTCTGCaVCAT 
TATTrGTAAA 
TGCATGGTTT 
CGAACACGGA 
TCCTTCXrCAG 
AAGAATACCA 
CACTCCCCCC 



!AAT 
GCTGGCAGGG 
OTTACTGCAA 
TCCTCTGCAA 



CCACQOOTQC ACCCAGGCC3V GCAAGCTGAA GOGCCACATG 
GTCCOCCATG AI3GQTCAAGT COGACGACaG TCTCTCCACC 
CACCAGOGAC TTGGTGGGCA GCGCCAGCAG CGCGCTCAAG 
GAGCX3AGAAC GACCCCAACC TGATCXX33GA GAACGGGGAC 
CGAGGAAGAG QAAGAAGAGG AGGAAGAGGA GGAGOAGGAG 
GQACTAOGGC TTCGGGCTGA GCCTGGAGGC GGCGCGCCAC 
O0CGGTC!GTG GGOGTGGGCG ACGAQAGCCG CGCOCTGCCC 
GCTCftGCTCC ATGCAGCACT TCAGCXSAGGC CTTCCACCAG 
AGAAGCATAA GC330GGCCAC CTGGCCGAG6 CCXSAGGGCCA CAQOGACACT 



CCAAGTTCAA 
AGGAGGACGA 
GCGAGAGGOT 



gcctcctojt 
gagctggaos 
attagtggtc 

GAGTACTGTG 
ACX3GGCGAAA 
CTCACCAQGC 
TGTAAGATGC 
GATOGAGTGT 
CTCCCACCTG 
CXrrGTAGGAT 
ACGAAGCTAA 
TTCTTTTTTC 



CX3CATCAAGC 
GTGTACTCGC 
AGCTTCX3GAG 
GGGAGCTTGC 



STCGGCCTCG 
GGCIGAGCCC CTTCTCTAAG 
C6ATGCCCAA CACGGAGAAC 
AGCrCAAAGA TCXX^TTCCTT 
CQGAGCACTC CTCGGAOAAC 
GAGGGATCTC GGGGCGCAGC GGCAOQGGAA 
OQGGCACGGG CAGGCCCAGC TCAAAAGAGG 
GGAAAGTCTT CAAGAACTGT AGCAATCTCA 
GGCCTTATAA ATGOGAGCTG TGCAACTATQ 
ACATGAAAAC 
CTTTTAGCGT 
TGAATAATGA 



TCGAQAAGGA GTTCGACCTG 
AGTGQCTCGC CGGCTAOSCG 
ACTCCAGACA ATCQCCTTTT 
GCTTCTCCAC ACCX3aXX3GG 
GTGGAGGGAG CAOOCCXXAT 



GTACAGTACC 
TATAAAAACT 
TTTCACCACT 
TCCCATGTGA 
OTGCITGTCA 



GAATAGAGGT 

(xxrmccccx; 

TTTAAACAAA 
CCAGCACAI 



AGSTTTACAA ATGTGAAATT 
ACATGAAAAA ATGGtaCa«3T 
ATATTAATAC CCCTCCCTCA 

ATCGccxrrcc agcxxx^ctc 

AACAQAAGTA 



r TCTCACCQTT TGAATGCATG 



1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500
1560 

1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 

2220 
2280 
2340 
2400 
2460 
2520 
2580
2640 
2700 
2760 
2820 
2880 
2940 
3000 
ATCTGTATGQ GGCAATACTA



TTGCATTTTA C 



CATGTACAGT 
A6ACAGAATA 
TTAAATGTAT 
CRGAACAAGT 
CAC»TCGATa 
TAATAAACCT 
TAAAAAGATQ 
AACAAAACTG 
TQGGCTGTTT 
TGCAAAAGCC 
TGCTTTAATA 
GOTTGTCAAG 
AGA60CTGAC 
CAACGTGGTA 



TTTATTTTAT 
GATAGCACTO 
CAATTGOAAA 
TTATTCTATC 
GTTCTTTCTA 
TAGGAACTAG 
ACCCATTATG 
ATTATACCAG 
TGCCCAAAGT 
CTGG AACGCA 
ATGTCTTTTT 
TGOACAATCA 



GGAAGAAAAA 
ATrrGGTTTT 
CTTGTTATAA 



TCCTAGTTAA 
TATAAAAGCT 
TTTATTTTTT 
ATTAAATACA 



ACCATGCTGC 
GGAAAAACAG 
TCTATGAGCT 
AGGCCTTGAA 
AAAATATGAG 
ACTTGfTAfiCT 
TATATTGTAT 
TCATCATTTT 



AATGATAAAC 



OAGAAAlGAAT 
TTCTTTTCCA 
TTTTATTTTT 



GAGTATAAAA 
TACAGGTCTA 
CATTQAGGAQ 
TATTGAQCTT 
TTAACATAGA 
CTTTTTTAAA 



CTATTTGCCA 
CTATGAATTC 
TTTAAGAGTA 
TACACTGTGT 
TTAATTTTTT 
ACTCTGCCTG 
TAAACXTTGCT 
TACTTAAGGG 
CATTTTTTAA 
ACTTACTTGa 
AATGAATGAT 
TATAAATOTT 



TTTAAAACTA 
TTAAOACnO 
GCAGTATATA 



TTAAACAATG 
CTAGTAAGGA 
GCACCAAAAG 
TTTAAGACCT 
AACA6AAGAA 
GAACAGGTAT 
CATTTAAATG 



ATACATTCTG 
TCTTTGGATT 
AAATGTCTGT 
TTGACAAATT 
TGCCXTGGAT 
TAATTCAGCA 
TTCTCACAAC 
TCCTTTAGTT 
TGGTGAGAGC 
ATTA AATTG A 
GTTCATTTTA 
AAATAGATCC 
TGTATACCAT 
AATAATATGA 
ATAAGCTAAT 
TGACATTCTT 
CCCTAAAAGT 



AAOCCTCTAT 
CTCTAAAGGG 
AACAGAAAAA 
CTATTAAAAC 
TTGGGTGAGG 
AATGGCTACC 
TAATTTTATA 
TTAAAAGAAA 
ATGTGTAATG 
TGAAGATATT 



3060 
3120 
3180 
3240 
3300 



3540
3600
3660
3720 
3780 
3840 
ATTGAAAGGA
TTTATTAGCA 
ATTGATACAA 



CTTTTTTATT 
TOATATCTOT 
CAGATAGGAC 
AAAAGTTGCA 
AAACTAAAAA 
ACGCAACATT 



ATCTTTTCTC 
AATTAAGTGC 
AGAATGCTGA 



TAATCAGAGA 
ACTGTACAAT 
AAAAAAATTG 



GCAAGCGCTG 



AAAAATGGTA 6TGGAAATTC 
AAAAACATAC ATTGaOGAAA 
ACTCATGTTG ATTCCTATGC 
GTATTTQAAT TAAATQTTCA 
TTTTTAACTQ TTQCTTGTTC 



ATTIGTATQC TTCAAAAAAA 
GAAOCCATAT AATGGCGOTT 
GGGCTTGrXAC ATATCCTTTT 



ACATTCCAGC ATCTTACCTT 
AAAATAAAAC CAATGTTTTG 
TTAGATTGGA AAGAATTTCA 
ACTTTTTTGT AAATGGCAAT 




TGAATGGAAA 
AAOGATTTTT 
AACACTTCAT 
ACAAOACTTQ 
CTCTTCAGGT 
TTAAATATAG 
TTTTCTGTAT 
TATT TATA TT 
CCTTTTTTGG 
ASGATAATAT 
AAQTGIGACA 



TGTACTTCAT 
TTOTTTTTAT 
TGOTOTTCAA 
TACBGAGGTT 
TTTCCCAGTT 
AACACAATCT 
TTTACTTGAC 
CAOAATACAC 



TAIGTTTAGG 
TTATCCATTT 
TGTAAAAAAA 

GAAAATGCAC 
TTCTRGAATQ 
GAAGCTTGTA 
TAGTGGAAAA 



TTACAGATGA 



ATATTAAA6A 
TTGXTATTGG 
CTTTTACTAT 
TTAATTTGAT 
TCATTACAAC 
AACCACI6TC 
TCCCTTTATT 



AACTAAATGG 
GAAAGCCCGC 
CCTTTTCTAT 
GGGAGTCACT 



CTATCACCCT 
AAGGAAAAAA 
TGCTTTATAT 
TTTAGTCAAT 
TCCTQTAATG 
ATTTATTATT 



CKSQCTTTTT ATTOTATTTa 



OAAAAAATAA AAAAAATTAA 



4140 
4200 
4260 
4320 



4680 
4740 
4800 
4860 
4920 
4980
5040 
5100
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580
5640 
5700 
5760 
5820 
5880 
5940 



1 

MSRRXQGKPQ 
LIFIEHKRKQ 
CPKQEHIADK 
SAWFLLQHAQ 
t«IPGSVSRE 
LRUtPKAMBP 
PPLQSAPPPS 



I 



CNGSLCLSKA 
IiLHHKGLSSP 
NTHGLHIYIjB 
ASGLAEGRFP 
PAMDFSRHLR 
QPFVKSKSCE 
SSPMTVKSDD 



PIiBAILTDOE 
VDKPPSPSPI 
RS7UIGAI.IPT 



ElAGNTSSPP 
FCGKTFKFQS 
GLSTASSPEP 



HAIiPOVMOGM VLSSMQHFSE f 
TVNGRGCSP6 ESASGGLSKK t 
AOYAASRQLK DPFLSPGDSR C 



GDHDLLTCGQ CQMNFPLGDI 60 

V6IQVTPS0D DdiSTSSKRI 120 

PGMSAEYAPQ GICKDEPSSY TCTTCKQPPT 180 

VQIPSGLGAE CPSQPFI^GI HIADNNPFNIi 240 

RHHU3PHSIE RIiQAEEraLA THHPSAFORV 300 

LSPGRPSPMQ RU.QPFQPGS KPPFUVTPPL 360 
HIiWHHRSHT 
GTSDLVGSAS 
VDYGFOLSLE 

KRGHLASAEQ HROTCDEDSV AGESDRIDDG 600 

PFSKRIKLBK EFDLPPATMP NTEMWSOML 660 

5 TPPCELDGGI SGRSGTGSG6 720 

I RfiSHTGBRPV KCELOnACA 780 
» OIKTE 






Seq ID NO: 86 DNA sequence
Nucleic Acid Accession #: XM_035292.2
Coding sequence: 53-1576



GQAGAAQATG CIGGCCGCCA AGAGC6CX3QA CGGCTCGGOG CCQGCAGGOG AGGGCGAGGG 
CGTGACCCIG CAGOGGAACA TCACQCTQCT CAACGGCGTQ GCCATCATCQ TGGOaACX:AT 
TATCGGCTCQ GOCATCTrOS TGACGCCCAC GGGCGTGCTC AAGGAGGCAG GCTOGCCGGG 
GCTGGCGCTQ GTOGTOTOGa CCQCGTGCX3G OOTCTTCTCC ATCGTGGGCG CGCTCTGCTA 
CGCGGAGCrC GGCACCACCA TCTCCAAATC GGGCGGOGAC TAOGCCTACA TGCTGGAGGT 
CTAOGGCTGG CTGCCXX5CCT TCCTCAAGCT CTGGATCGAG CTGCTCATCA TCCGGCCTTC 
ATCGCRGTAC ATCGTGGCXX: TGGTCTTCGC CACCTACCIG CTCAACiCCGC TCTTCCCXaC 




CTGOCCQGT6 CCCGAOGAGG 
GGCCBTGAAC TGCTACAGCG 
CAAiSCTCCTO GC(3CTGGCCC 
TGTGTCCAAT CTAGATCCXa 
TGTGCTGGCA. TTATACAGCG 
CACAGAGGAA ATOATCAACC 
CATOSTGACG CTGGTGTACG 
. GCAGATGCTG TCGTCCGAGG 
GTCCTGGATC ATCCCCGTCT 
GTTCACATCC TCCAOQCTCT 
CTCCMOATC CACC CACAG C 
QACGCTGCTC TACGOCTTCT 
CAACTGGCTC TGCGTGGCCC 
TGAGCTTGAG CX3GCCX:ATCA 

CCTCTTCCTG atcqcx:gtct 

CATCATCCTC AGCGGGCTGC 
GTGGCTCCTC CAGGGCATCT 
CCCCCAGGAG ACATAGCCAG 






CAGCCAAGCT CGTGGCCTGC CTCTGCGTGC TGCTGCTCAC 600 



1080 
1140 
1200 
1260 



CCGTGGCCGT 



TCTTCGTGGG 
tcctcacccc 
CCAAGGACAT 
TGGCCATCAT 
AGGTGAACCT 
CCTTCTGOAA 
CCGTCTACTT 
TCTCCACGAC 
GAGGCCGAGT 



TGAAGGCACC 
CTATGGAGQA 
CCTGCCCCTG 
CCTGGCCTAC 
GGACTTCGGG 
GTCCTGCTTC 
GTCCCB6GAA 
CGTOCOGTCC 



CGGCATGATC 
GGCCCTGCCT 
GACACCCGTG 
CTTCGGGGTC 
CXSrCCTQTGT 



GTCCAGATCG 
AAACTG6ATQ 
TGGAATTACT 
GCCATCATCA 
TTCACCACCC 
AACTATCACC 
GGCTCCGTCA 
GGCCAOCTGC 
CTCGTGTTCA 
ATCAACTTCT 

GTGTTCTTCA 
GAGTGTGGCA 
TGGTGGAAAA 
CAGAAGCTCA 
G6AGCAT6C 



TGQGQAACAT 
TGAATTTCGT 
TCTCCCTGCC 
TGTCCACCGA 
TGGGCGTCAT 
ATGGGTCCXrr 
CCTCCATCCT 



ACAGAAAGCC 
TCCTGGCCTG 
TCGGCTTCAC 
ACAAGCCCAA 
TGGAGGTGGT 



1380 
1440 
1500 



21 



31 



41 



51 






I I I I I I 

MAGAGPKRHA LAAPAAEEKE EAREKMUWUC SADGSAPAGE GEGVTLQRWI TLUIGVAIIV 
GTIIGSGIFV TPTGVLKEAG SPGIALWWA ACX5VFSIVGA LdfABliCTTI SKSGGDYAYM 
LEVYGSLPAF LKLWIELIiII RPSSQYIVAL VFATYIiI>KPI< FPTCPVPEEA AKIiVACLCVL 
LLTAVMCYSV KAATRVQDAF AAAKLLALAL ZII.LGFVQIG XODVSHIfPN F8FEQTKLOV 
GKIVLALYSG LFAYGGWHYL NFVTEEHINF YKNLPIAIII SLPIVTLVYV LTNIAYFTTL 
8TEQMLSSBA VAVDFGNYHL C5VMSWHPVF VQUSCFGSVM OSIiFTSSSI>F FVQ3REOHLP 
SILSMIBPQL LTFVPSLVFT CVMTLLYMS KDIFSVINFF SFFina.CVAIi AIIGMIHLRH 
RKPEUERPIK VNI.ALE>VFFI LACLFLIAVS FHKTFVGCGI GFTIILSGIiP VyFFGWIWKH 
KPKNLLQGIF STTVLCQKI*! QWPQBT 



Seq ID NO: 88 DNA sequence
Nucleic Acid Accession #: NM_005268.1
Coding sequence: 168-989



TAAAAAGCAA 
TCT6GATATG 
AGCCCTGAGG 



GTGATGACCA 
TTGATGAGTT 

CATGcccxrrc 

ACCGAGAAGC 



TTCTCTATGT 
ACGCAGATCC 
TTTTCACCCT 
TCATCTACCT 
TGTGCACAGG 
CGGGTGACCT 
GAGACCATGT 



11 
I 

AAGAATTCGC 
AAATTCAAGC 
AGTAGTCACT 
ACrCCTGAGT 
CTTC3VTCTTC 
CAAGGACrrC 
CTTCCCTGTG 
ACTGCTCGTG 
CCaiTGGGGAG 
GTGQAC3VTAT 
GTTCCACTCA 
ATGTCCCAAT 
CTTCATGGTG 
GGTGAGCAAG 
TCATCACCCX: 
CATCTTTCTG 
GAAGAAAACC 
GAGGCTCTftG 
GOaCAGaCAA 



CCCCGAAAAC 



GTCCACXTITG 

GGGGTCAACA AGTACTCCAC AGCCTTTGGG 
CGCGTGCTGG TGTACCTGGT GACGGCCGAG 
GACTGCAATA CTCGCCAGCC CGGCTGCTCC 
TCCCaTGTGC GCCTCTGGGC CCTGCAGCTT 
OTCATGCACG TGGCCTACCG GGAGGTTCAG 
AACAGTGGGC GCCTCTACCT GAACCX»3GC 



51 
I 

CTTCCCCGCT 
GAGCCAGGAG 
AACTGGAGTA 
CGCaTCTGGC 
CGTGTGTGGA 
AACGTCTGCT 
ATCCTGGTGA 
GAGAAGAGGC 
AAGAAGCGOG 



r CCCTCCTGTG QTCAAQTGCC 660 



AGATGCCAC6 
CACGGTACCR 
GGCTCAGACA 
ATCTTGTGAG 
CATCTCTCAT 



AGTGCCTOGC 
CCTCTTCCTG 
GTCATCCTCC 



GGGCTGCCTG 
AGGTGCAACC 
TCAGACGCTC 



CAAACAAGAC 
TCTCTTACCA 
GACTGGTCTQ 
TGAGAGTGGG 
TGGQAGOCAO 



GCTOGGTTTC CTTTTCTAGA A 



11 



CTCGTGQAGC 
GCTCAAGCCA 
GACCTCCTTT 
GACCGCCCCC 
GCAGGTTGGG 
GGAGCTAAGC 
TTCCTAGTCC 
CTGCTCT6CA 



I 



MNH3IFEGLL SGVNKYSTAP GRIKLSLVFI FRVLVYLVTA 
SHVCFOBFFP VSEVRLHALQ I.II.VTCPSU> WMHVAYREV 
OKKRGGLHWT YVCSI>VFKAS VDIAFLYVFH SFYPIOniiPP 
8EXNIFTLFM VATAAICIIaL IIIiVEI.iyi>VS KRCHECLAAR 
DDIiSGDIiIF LGSDSHPPUi PDRPRDHVKK TIL 



Seq ID NO: 90 DNA sequence 

Nucleic Acid Accession #: NM_002391.1

Coding sequence: 26-457 



31 



51



I I I I I I 

CGGGCGAAGC AGCGCGGGCA GCGAGATGCA GCACCGAGGC TTCCTCCTCC TCACCCTCCT 

CGCCcrocra gcgctcacct ccgcggtcgc caaaaagaaa gataaggtua agaagggcgg 

CCCGGGGAGC GAGTGCGCTG AGTGGQCCTG QGGGCCCTGG ACCCCCAGCA GCAAGGATTG 
OQGCQTGGGT TTCCGCGAGG GCACCTGCGG GGCCCAGACC CAGCGCATCC GGTGCAGGGT 
6CCCTGCAAC TGGAAGAAGG AGTTTGGAGC CGACTGCAAG TACAAGTTTG AGAACTGGGG 
TGOGT6TGAT GGGGGCACAG GCACCAAAGT CCGCCAAGGC ACCCTOAAGA AGGCGCGCTA 




CMTGCTCAO TGCCAGGRGA CCATCCGCGT CACCAAGCCC TGCRCCCCCA AGACCAAAQC 420 

AAAGGCCAAA GCCAAOAAAO GGAAGG6AAA. GGACTAGACG CCAAGCCIGG ATGCCAAGGA 480 

GCCCCTGGTG TOVCATGOGQ CCTQGCCACG CCCTCCCTCT CCCAGGCCCG A6ATQTGACC 540

CACCAGT6CC TTCTGTCI6C TC6TTRGCTT TAATCAATCA. TGCCCTGOCT TOTCCCTCTC 600 

ACTCCCCAGC CCCACCOCTA AGTGCCCAAA GTGQGGAGG6 ACAAGGGATT CTGGQAASCT 660 

TGAGCCTCCC CCAAAOCAAT GTOAGirCCA GAGCCCGCTT TTGTTCTTCC CCACAATTCC 720 

ATTACTAAGA AACACATCAA ATAAACTGAC TTTTTCCCCC CAATAAAAGC TCTTCTTTTT 780 
TAATAT 



I I I i I i 

HQHRGFLLLT UALIALTSA VAKKKDKVKK GSPGSBCAEW AHGPCTPSSK 006VGFSB0T 
OSAOniRIRC RVPCMNKKEF GADCKYKFEN MOAaxXJIGT KVRQGTZjKm RyNAQOQSTI 



Seq ID NO: 92 DNA sequence

Nucleic Acid Accession #: NM_005130.1

Coding sequence: 98-802

1 11 21 31 41 51 

I i 1 I I I 

CTCTACCTGA CACAGCTGCA GCCTGCAATT CACTCCCACT GCCTGGGATT GCACTGGATC 60 

CGTOTGCTCR GAACAAGGTG AACGCXICAGC TGCAGCCATG AAGATCTGTA GCCTCACCCT 120 

GCTCTCCTTC CTCCTACTGG CTGCTCAGOT GCTCCTGOTO GAGGGGAAAA AflAAAGTGAA 180 

GAATGGACIT CACAGCAAAG TGGTCTCAOA ACAAAAGGAC ACTCTGGGCA ACACCCAGAT 240 

TAAGCAGAAA AGCAGGCOCG GGAACAAAGO CaJVGTTTGTC ACCAAAGACC AAGCCAACTG 300 

CAGATGGGCT GCTACTGAGC AGGAGGAGGO CATCTCTCTC AAGGTTGAGT GCACTCAATT 360

GGACCATGAA TTTTCCTaTO TCTTTGCTGG CAATCCAACC TCATGCCTAA AGCTCAAGGA 420 

TGAGAGAGTC TATTGGAAAC AAOTTCCCCG GAATCTGCGC TCaCAGAAAG ACATCTGTAG 480 

ATATTCCAAG ACAGCTOTGA AAACCaGAGT GTGCAGAAAG GATTTTCCRG AATCCAGTCT 540 

TAAGCTAGTC AGCTCCACTC TATTTGGGAA CACAAAGCCC AGGAAGGAGA AAACAGAGAT 600 

GTCCCCXaGG QAGCACATCA AGGGCAAAGA GACC3VCCCCC TCTAGCCTAG CAGTGACCCA 660 

6ACCATGGCC ACCAAAGCTC CC6A6T6TGT GGAQGACCCA GATATGGCAIV ACCAGAGGAA 720 

GACTGCXX:TG QAGTTCTGTG GAOAGACrrG GAGCTCTCTC TGCACATTCT TCCTCAGCAT 780

AGTGCAGGAC ACX3TCATGCT AATGAGGTCA AAAGAGAACXS GGTTCCTTTA AGAGATGTCA 840 

TGTCGTAAGT CCCTCTOTAT ACTTTAAAGC TCTCTACAGT CCCTCCAAAA TATGAACTTT 900 

TGTGCTrAGT GAGTGCAAOQ AAATATTTAA ACAAGTTTTG TATTTTTTGC TTTTGTGTTT 960 

TGGAATTTGC CTTATTTTTC TTGGATGOGA TGTTCAGAGG CTGTTTCCTQ CAGCATGtAT 1020 

TTCCATGGCC CACACAGCTA TGTGTTTGAG CAGGGAAOAG TCTTTGAGCT GflATGAGCCA 1080 

GAGTGATAAT TTCAGTGCAA CGAACTTTCT GCTGAATTAA TGGTAATAAA ACTCTGGGTG 1140 
TTTTTCAAAA AAAAAAAAAA AAA 



1 11 21 31 41 51

I I t I I I 

HKICSI.TI>I>S FIXUUkQVIJ. VBOXKICVKNG UISKWSEQK DTLGHTQIXQ KSRPGNXOXF 60 

VTKDQANCRM AATEQEEGIS IiKVECTQIiDH EFSCVFAC3IP TSCLKIiKDER VYWRQVARMI. 120 

HSOKDICHYS KTAVKTRVCR KOFPESSLKL VSSTLFOITX PRKEKTEMSP REHIKGKETT 180 
PSSLAVTQTM ATKAFECVED PDMAHQ&XTA IiEFCQETWSS LCTFFLSIVQ DTSC

Seq ID NO: 94 DNA sequence 

Nucleic Acid Accession #: NM_012101 

Coding sequence: 125-1891 


1 11 21 31 41 51 

I I I I I I 

CTCCTCACAG OTSTGrcrCT AOTCCTCXTK} GTTGCCTaCC CCACTCCCTG COGAGACGCC 60 

TGCCAOAAAG GTCACCTATC CTQAACCCCA OCAAGCCTGA AACAaCTCRQ CCAAGCACCC 120 

TGOGATGGAA GCTGCAGATG CCTCCAGGAG CAACGGGTCQ AGCCCAGAAG CCAGGGATGC 180

CCOGAGCCCG TCGGGCCCCA GTGaCAQCCT GGAGAATGGC ACCAAGGCTQ ACGGCAAGGA 240 

TGCCAAGACC ACCAACGGGC ACGGCX3GGGA GGCAGCTGAG GGCAA6AGCC TGGGCAGOGC 300 

CCTQAAGCXK. GGGQAAGGTA GGAGOQCCXTT GTTOGCGGGC AATGAGTGGC GGGGACCCAT 360 

CATCCM3TTT GTCOAGTCCO GGOAOGACAA GAACTCCAAC TACTTCAQCA TGOACTCTAT 420 

GGAAGGCAAO AGGTCGCOST AOGCASSGCT (XAGCTGGGG OCTOCCAASA AGCCACCCGT 480

TACCTTTQCC QAAAAGGGCQ AOGTOOQCAA OTGCATTTTC TQGGAOTCXX: GQAAGCCCAC 540 

6GTGTCCATC AT6GAGCXXX GGGAGACCOO GCGGAACAGC TACCCCCGGG COQACAOQGQ 600 

CCrrTTTTCA CGGTCCaAGT CXXSQCTCCGA GGAGCTGCTG TGCGACTCCT GCATCGGCAA 660 

CAAGCAGAAG GCXK3TCAAGT CXrrGCCTGGT GTGCCAGGCC TCCTTCTGCX; AGCTGCaTCT 720 

CAAGCCCCAC CTGGAGGGCG CCX3CCTTCCG AGACCACCAG CTGCTCGAGC CCATCCGGGA 780

CTTTGAQGCC CGC3UVGTGTC CCGTGCATGG CAAGACGATG GAGCTCTTCT GCCAGACCGA 840 

CCRGACCTGC ATCTGCTACC TTTGCATGrT OCAGGAGCAC AAGAATCATA GCAOCGTGAC 900 

ABTGGAGGAS GGCAAGGCCX3 AGAAlQGAGAC G6AGCTGTCA CTGCAAAAGG AGCAGCTGCA 960 

GCTCAAGATC ATTOAGATTC AGGATGAAGC TCAGMGTG6 CnSAAOanSA AG6ACCXX:AT 1020 

OUVGAGCITC KOCKCCPJOG AGAAOOGCAT CCTGGAGCA6 AACTTCCQGG ACCIGOTGCB 1080

GGACCTGGAQ AAGCAAAAGG AGQAAGTGAQ GGCTGCSCR3 GAGCAGCGGG AGCAGGATGC 1140 

TGTGGACCAA GTGAAGGTGA TCATGGATGC TCTGQATGAG AGAGCCAAGG TGCTGCATGA 1200 

GGACAAGCAG ACCCGGGAGC AGCTGCATAG CATCflfiOSAC TCTGrGTTGT TTCTGCAGGA 1260 

ATTTGQTGCA TTGATGAGCA ATTACTCTCT CCCCCCACCC CTGCCCACCT ATCATGTCCT 1320 

GCTGGAGGGG GAGGGCXTGG GACSVQTCACT AGGCAACTTC AA3GACQACC TGCTCSU^TGT 1380

ATGCATGCGC CAOGTTGAGA AGATGTGCAA GGOGGACCTG AGCCGTAACT TCATTGAGAG 1440 

GAACXACATQ GAQAACGGTO GTGACCATCG CTATGTQAAC AACIACACGA ACAGCTTCQG 1500 




GGGTGAGTGG AGTGCACCGG 
TGGGGTCCGG ACATCATACC 
GWVGRA.TTTC AACAATCTCT 
CTCCTCCAGC ATTCAGAACT 
CTOCCTGAAA GGCTATCCXTT 
TTQGAAATCT GGCAAGCAGA 
CAAOGOGATT GGGTCCAACG 
CCCCTGCTCT TCCTCCTGAC 
TGGGAGGGAG CCTGGTCCTG 
GTTCCGGCCT CTCXX3ACTTC 
TGACCTCAGA TGGTCACCAT 
TAGGTTGGGG CXTTGCCCTAA 
C3W3TGAGTAC COC3CATGGTA 
CTATAQACGT TTCTCTCCAA 
ACAGCCACCC ATCTCCCATT 
C3TGCTCTCTC TCGTCCTACC 
CTGCAGATGG AAACCTCTCA 
ACAGCCACTT TGAGTCTGTG 
TAGCCAAGAT ATTCCTCTGT 
TCCACCCATG CAAATAGCTA 
TTCAGTCTAC ACTTTGGCAT 
OTGCCTTACA CACTGCCCCC 
TQATTACCCC CCATGTTGCA 



CCCTCATGOS 
CTATGCTGTC 
AAGCCCCATG 



GAGATACTCC 
TCCTGGCCGC 
AGGTAACTAC 
CCTGCCCGTC 
GAGCCAAAGC 
TCACTACCGG 
AGCTCCTGGC 



ATGTACCTGA CACOCAMGG 



GTCCAAGGCA 
CCCAAGGCCC 
CCATTCTACG 
GGAAGGAACG 



CACCTGCCCT 
CCCACTGGCC 
CATTCCTGTG 
CCCGCCAGCC 
TCAGCCTGCC 
GGCCCTATCC 
CACATGGCCC 
TATCAATGCC 
GTGTCTTGAC 
GTCCCTGGM3 
TCCCTCTGCT 
CTQGCCCAGC 
TCTCTCTGGC 
ACCCTCAGCC 
TATCAGGGTG 
TCCCGTCTCA 
TTG6GGAGGG 



CTGCAGCCCT 
ACACTCCATT 
CTCAGAGGCC 
TCCTCCTCTC 
TCTCCMGCCC 
CCCAATGTTG 
ACCTCCTGCT 
C»GCATGGCA 
ATCaCCCTAC 
GGTGGCTTCT 
GAGATAAAGA 
TACCATTTAC 
GATGGAGTGT 
GTOKCCCAT 
CTCAAGGATT 



CTGCCaGCCT 
CAGACTCCTT 
AACCCATCAC 



GCTCCTCCTT 
AGCCCCAGAC 
TCAACAAAGG 
AGGCGCCACA 
CTTGTCTGGG 
CTTGGGGGCA 
TCCTQCCTTG 
ASGGGTGAGA 



ACGCCCTGCI 
TCAGCAGATG 
TCCCAGAGGA 
GAACCreCAG 



CSrrGACTGQC 
ATTCCCTTAA 
CATTTGCCTA 
G6CTGGGCTG 
CAOAGGCTGC 



GTCTCCAGGC 
CCTGGACAGC 
CTGGCCCTAC 
TGGCCAAGGG 
GGTCTCCACC 
AGGATGACCT 
CATGATATAA 
CAQWITTTCA 



CTCCTCCTTC 
CAAAACCAGG 
GCT CTGGAAG 
CCCCTTTGGA 



1560
1620 
1680 
1740 

1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 



2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 



MEAADASHSN GSSPEAJIDAR SPSGPSGSLE NGTKADQKDA KTTNGHGGEA AEGKSLGSAL 
K5GEGRSALP AGNBMRRPXI QFVESGDUKM SNYFSMDSME GKRSPYAGLQ LGAAKKPPVT 
FAEKGDVRKS IFSESRKPTV SIMEPGBTRR HSYPRADTGL FSRSKSGSEE V1.CDSCIGKK 
QKAVKSCLVC QASPCELHLK PHIiEX3AAFRD HQIiIiEPIRDF EARKCPVHGK TMBLFOQTDQ 
TCICYLCMFQ EHKNHSTVTV EBAKABXETE IiSLQKBQIiQL KIIEIEDBAE KHQKEKDRZK 
SFTTNEKAII. BQNPKDIiVHD LBKQKEEVRA ALBQREQDAV DQVKVIMnAL DERAKVLHEO 
KQTREQLHSI SDSVLFtQEF GALMSMYSIiP PPLPTYHVLL EQEGLGOSLQ NFKDDIJJSVC 
MRHVEKMCKA DLSiiNFIERN HMBNGGDHRY VNNYniSFGG EMSAKIOTIKR YStWLTPKGG 
VRTSYQPSSP GRFTKETTQK NFNNI.YGTKQ NYTSRVMEYS SSIQHSDNDIj PWQGSSSFS 
LK6YPSI.MRS QSPKAQPQTW KSGKQTMLSH YRPFYVNKGN GIGSNBAP 

Seq ID NO: 96 DNA sequence
Nucleic Acid Accession #: NM_080668.1
Coding sequence: 83-841

r i' t' r 

CCTTCXX33GT 
TTATGTCTGG 
CATCTCCTAC 
TCCTCCCTGA 
TAAAGAG6AT 
CTAGQATTTC 



I 



GAGCTCGAGA 
GOGCTCXX5GG 
AGQCTCTQAA 



CX3GAGCCTAG 
CCAAGGGCCC 
CTCCCQAGC3V 
CCCATCGTCT 



CAGCACTCCT 
CAGAQACTIG 
CTCTGCCTCT 
GGCAGAAGAC 
GGTTTGTGCA 



6AAATOTCTA 
ACCTCCACX:C 
TTGTCOGGAG 
AAGCCCTGGG 
GSTAAGAAGA 
ATGAATGCCQ 



TTAGGAAATG 
CCTTCCTATC 
GAATCAGTGG 
CTTCTTGGGA 
CCCXCCTGAG 



GGACACTTAG 
AGGACTGTCr 
GGGCCGCCTG 
TCCCCAAAGT 
CCXyVGCCAGA 
GGGGTGGCTG 
TTGCCTTCTG 
TGGGTTTCCG 
CTOTCTTOGA 



CT6AQGC0SA 
AGAAAGTCAG 
CAGGCCGCCXS 
TCTC3GCCAGT 
CCCCAQACAT 
AGAAAATGCC 
AGTTTQAAGC 
CrOGCCASAC 
GOTCCCCTCC 



TAAGCCTCTG 
AATCTGGCCG 
OGTGGCCCAT 
CTTTTTCTTQ 
CAAGACACAC 
GTCOVGCTCX: 
GCGTTCCTAC 
GTCCTGCTTT 
GGTGTGCTCC 



S CGGTCCGGAG 



ACCATAGCCA 
AGTTAAAGGG 
CTTGGAAATA 



GCTCATGGCC 



ACXX3AGTCTA 
CAACTATGCT 
ATAGCAATTT 
CTCATSRiTCT 



TCT6GTCTGC 
GCAAAGGGTG 
TCTGGGCTCG 
GAAGGTGGCC 
CTTGGCCCTA 
TGTAAASTCC 
TASTTTTTGG 
CTGGAGAATT 



AGAGATCTTQ 
TGCTGAGCAG 
TCTCCCTCCT 
OCTGCTCTTG 
GCTTGGGCAG 
TCACTGGTGT 
GTTTCCAGAT 
CTGAGGGTTG 
GGCCCAGGGG 
TCTTCTTGAA 
GTAAATAGTC 
GGACATTCAG 
GACCATAAGT 
TTTGGTAAAT 
TGTGGCCACC 



AAGACACX:CA 
GCTGTAGAGG 
GAGAAAGAAA 
AGCGTCCCTQ 
AAGGAAGGAG 
AGCCXSGCTGG 
GGCTTCGAGG 
AAACICACCG 
GGAATCTCCC 
AAAACGGAGC 
TTTGATCTCX: 
GTCCTGTACA 
TTACVl'G'lXfr 
CAGCGGCAGC 
CCTGTCTCTT 



GAGCCGCTCA 
AGCX3GAAATC 
GTGCGGCTGC 
TCCCAGCTGT 
ACGAGCCCCC 
CCACCCCCAC 
AGCTGGACGC 
AGACCCTGGG 
GGCrOCIGGG 



CACOVCCOGA 
TGGATGAGTG 
TGGTTQAATG 
TAGCCaCCTC 
GTGTGCTGGT 
CATCTTGGTT 
GTCGTCCTGT 



AaGTGABAQG CMXXCIGCC 



CAQAOAATGC CAGGGAAGAT 
AGGCGQGCCX TAATAAAACC 



CTTTGATTCT 

gqaagaagct 
ttgqgtctca 
tgttqagatg 
catttcaggg 
tcccttgcac 
cctcttgtca 
gaagagaaag 
tcggaaagtt 
cctcs:x:tatc 
tccccatctt 
tgcccccxsac 
gsaagtagac 

GAGTGCTGGG 

CTCTGCCAGG TCTGGGAGTC 



TTGAAGAGTC 
TCCTCGOGTA 
•CTCTCACATG 



CCCACCTGTG 
CCAGAQAGAA 
CCTGTGGAGT 
GTGTACTACA 
QCCAGGTTOA 
AGGTGCTQTO 
CTGGASTCtT 
TTTCTTACRA 



CCAGACAGC6 
AGAACACTGC 
CCRTCGTGTQ 
CACCCCTGGC 
GCAGAQQAGC 
TGGATGAAAC 
CCAGGCCATC 



TAAAGAQQTT 
TTOGTGGGCT 
CTGAGTTTTG 
CAGAAGCTGT 
TAGGGC3GCTG 
AGTTTCTGTQ 
ACCRCrCTQC 
AATAASTTAC 
TAGCATTTTG 



TGGTGGATAA 
GGTQCAGGCC 
TOCTCAACOC 



1020 
1080
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 

1680 
1740 
1800
1860 
1920 
1980 
2040 




TCTGTGGTTT GTCAGACCT6 CAAGCAAGCC CCCTGCTCX3G CMGCCTAQQ TGTCCTTGAQ 2280 

CTGflACCGCA CIGAAGAACT CTTGTCCTCA CTOGCTOATG CAGCAGAACT CTTGGQAAAT 2340 

6TCTTAGTCC TGCAOAATCA GGAGTCACCA GATGATGCAG AGTTGAGATC ATCATTGCAA 2400 

AQTTCTCTGT TCCTGAGGAA CTAAATTTAA GGAAAAAATG GGATTTTGTT TTAOAGTTGG 2460 
AAAAAAAGCX: TGATTAAAGA GTTTCTGCCT GTTAAAAAAA AAAAAAAAAA AAAAAA 



11 



21 



31 



51 



I I I I 

MSGRRTRSGO AAQRS6PRAP SPTKPI.RRSQ RXSGSELPSI IiPEIHPKTPS AAAVRKPIVIi 
KRIVAHAVEV FAVQSPRRSP RISFFIiEKEN BPFGRELTKE DLFKTHSVPA TFTSTPVPNP 
EABSSSKEGE LDAROI.EMSK KVGKSYSRIjE TLGSASTSTP GRRSCFGFEO LLGAEDLSGV 
SFWCSKLTE VPRVCAKFWA FOMTLPGISP PPEKQKRKKK KMPEIUCTEI, C 
FEAAEQFDLL VE 









1 

GGGGCATTTC 
GCGGGCTCX» 
GCGGACCGCT 
TGCGTCCTGA 
GATTTOGGTT 

CCTTACTCTG 
AAATGTAAAA 
AGACTCATGG 
TT6AAAAAAA 
GGTGAAGTTC 
GGTGAACTTA 
GGATGTCTGA 



21 



GAGCCGGTGT 
GCGGTGCTGC 
GCAGCAGCCC 
TGCTTGTATT 
TCCTAAAGTT 

TTCxawKxxrr 

ATGAATTTAA 
AAATACCaOA 
ATCCTAGTGA 
AGACCCAQAT 



CXGAGCGGGC GCACGCGCX3G GAGCGGGACT CGGCX3GCATQ 
GCOTTGCTCC CTGCTGOSGC TGCAGQAGAC CTTGTCCGCT 
CCTGGCCGGT CATCAACTGA TCOGCGGCCT GGGGCAGGAA 
CGCGGTGCTG GCATTACAGA CATCTTTAGT TTTTTCCAGA 
TGTCOGGAAG TCACXCAACA GTATTGAATT TCGIGAATGT 



CTOAAaAOAT 
TTTAGCACCT 
GCCCACACAA 
CAGGTTTCTA 
ATGGAGCAGT 
GCTATCCGTG 
GACTTCATGT 
ACTGGT6AOQ 
CTGTACCTTQ 
CAQATAGACA. 



CAAGGGAGAT 
ATQCTGTGCC 
GCCTTCTGGA 
ATGTAGAATT 
ATATGGTGQC 
TTTATGGAAT 
GATATGGACT 
ACGTTGAGCT 



AATTGGAGAA 
TACAGTTTTA 
GATGATAAAT 
GACATCAGCA 
CTCACTTCTG 
TTTTAATTTT 



ATTAAGTTAC 
TTATTTAGTA 
QAAAAAGTAT 
AATGCAGAAA 
GTAAGAGAGC 
TGC7VACTTCA 
GTACTAAAGG 



ATACAAAMA 
TTCAGACTTT 
AATTCTATGG 
ATGAGCTCCT 
ACCTGTTCCG 
CCAAACTACC 
CTAAGTCCAT 
CRATTCGTCC 



CAACTACGTG 
GAAAAAAGCT 
GAAAAATGCA 
CATCAGAAAT 
TTTTGCAGGA 
CATTCAGCGC 



GCACTTTCAG 
GAAATGCATA 
GTGGATTCGA 



TGCAAGCAGA 



AAOTCTTGTT 
CCCTGGAATC 
AAAATAAACT 
ACAACAAGGA 
TTATAAACX3C 
TGTTCCTCAC 
AGTCTGTTGC 
TGGAGCACCT 



TAGAAQTTCT 480 

AGAACTTGCA 540 

AGGATTATTG 600 

CGCTTTTCTG 660 

TGTTCTGGCA 720 

GGAAGAAGAT 780 

TCAGATTGAT 840 

TGCATCTCAG 900 



OTGOTGCATC 
GAGTCTGAAT 
CCCACATACA 
GATTCTATTT 
CATTTACTTT 
ACACTTGAAA 
ATGA-rcCCAA 
GCTTTCATTA 
TTTGAACCAT 
CTCATCAGTG 
TATTTCGAGG 
TCTTGCTTTQ 



TCCTAQCTTT 
AGGGTTTAAT 
CTGAAGACCA 
AAGACTACGT 
TAGCAGATGA 
ATGATGAATT 
TACyVGACTGT 



CCGTQCTTCA 
GGATCTCTTC 
AGCATTTTTC 
TGTAAAATCC 



AGCGGCTAAC 
ATTTTGCAGA 
ATTTTCATAT 
ATTGCTTTCT 
AAAGAOTCTQ 
CTTTATTTOT QAAATTTGaC AAAGAGOrGG 



ACCTGGTGGA 
GGGTGTACTC 
GTTTCTACAA 



GGGCCAGTTC 
TCTAAACCAG 
GGGGAAGTCA 
AGACATCTCC 
TCTGTGAATT 
GTTTTGAAGA 
GAGAA7GGAG 
TTGCATCCAG 
GAGATTCTCC 
GAATTAATTT 
ATTACAOTAA 



CTTTCTGAAA 
GCAGTACTTT 
GTTATCTATT 
AAAAGATGTT 
CCAGACAGAC 
AAGOGTCTTG 
CGTGGTGATG 
CAGAGCCATA 



TGAGCTCTGA 
CCTCCAGTGA 
TTGTTGAGAA 
ATGAGGCGCC 
CTAAACCTAA 
CTGAGAAACA 
TGCAATCTAC 
GAAATGCCAA 
CIGAAOACCC 
CAGTTAAAAT 



AAGTCTGAAT 
ATTGGATCTT 
TGGTGTTTGG 
AGATTTTTCQ 



AAGGTTGCCC 
GAAAATAAAA 
AQAAAAGTAX 
GAAOCAOTAC 



AAAGATCAAC TTTTGGCCTC TTOTTTGACC TTTCTTCTQT CCTT6CCACA CAACATCATT 



GKACTOGATO 
TATACCCCCT 
AGACATGTAA 
ACTTCAGCCT 
GCCCAGAAAG 
TCAAACSAAG 
CTAG6AG6AC 
AGCTATGTGO 
AAACCIGTCA 



TTGGGCAAAG 
TATAAGCGQA 
CAACTGTATG 
GAAAGTCAGG 



TGGCA6AA6T 
TGCAGCCTTA 
TGTCAGATGA 
GATTTAATAA 
CAATATCXnr 
AAATAAACAA 



AAACTAAAQT 
CCACG CAGAT 
CGTTTCCTGT 
AGCCACTAGT 
ATACTGTTGC 



CGTTCCTGCA 
AqOCCTQAAT 
TTACAAAGAC 
GACCAAGAAT 
AGTGGTGTTA 
AGAAOAAATA 
AAATCTTCTO 
AGAGAAGCGG 



ATTCTCCCC7 
AACTGGGAAG 
AAGCATCTGA 
AGAATTAGA6 
ACAGTCACGT 



cnrrcAAACT gggcctgagc 

AATQGTCaAT TTATATTGAC 
GCCTGGATQG ATACCTGAAG 
TGTCAGCTCT TTCTCGGGCT 



TAGTACAAAT GCTTGGATCT 
GATGATGAAQ 
:TT TAGAlOaGATa 



V CASAATTAOC GCTCACAGCC 



AAGCAAATAA 
CGACTTTATA 
TTTAA TAAT A 
GAAGCCTTGG 
GGtACAATTC 
CATGTTTCTT 
TCATTGTGTT 
GAATGTCGAC 
AGATCCCCTA 



CACCACAGCA G 



TQCAOCCIGT GAACTTTTAC ATASCATGGT TATGrTTKTG 
GGACAGGGAG CCCTACCCAT GTACC3U3CTC 
CTTGCQT6TG ATGTTGATCA GOTGACAAGG 
ATTCACTGGT TCACTAACAA CAAOAAATTT 
GCrATATTQG ATG6AATTGT GGACCCTGTT 
TGXATTCGAG AATTCCTTAA ATGGTCXATT 
ACACCAAATC 
GGCTGaOAGC 



QCTGCTTCXSA 
TATGCAGCTG 
CTTACTAGAA 



TGATATACAT 
AACAGTGTra 
TAAATAAAGC 
TATTGGATCT 
ACS^TCCAT 
ATTTGTGGCT 
AGGGGGGTGG 



TGATGCCATT 
AAAGAAACGA 
GGTCAAGTGG 
TGAACTCnr 
GAAAGATGTT 



GCmCAAGA 
GAAGA6TCTC 
GCCrCKXAC 
GATCACCTAT 
CX5TTTGCCGC 
CTTTTAGCTC 



TTGTACCTTC GGQGGCCATT CAGCCTGCAS 



CTCAAGGAAG 
CCCTCGGGCA 
GCCAOGCTAT 



ATGCAGATGA 
GCCGCATCAT 
GAGGATTTCC 
ATTGTGGGAG 
TTCCTTTATT 
AAGGTGTCTC 
TCCTGGCCCA 
GCTGGCTGQA 



2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700
2760
2820
2880
2940 



ATCACTTGCC 



GAAQTCCTTA 
TGAAAAGAAO 
ACCTTCCGCA 
GCCCCAGACA 3600



3360 
3420 
3480 
3540 



GCCAGGCAAC 
TTTTCTCATC 
GCCCACCCTC 



3660 
3720 
3780 
3840 
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GCCX3CGTTGG AGTGCTACAA CACGTTCATT GGOSAOAGAA CTQTAGGAGC GCTCCAGCTTC 3900 

CTAGC5TACTG AAGCCCACTC TTCACTTTTG AAAGCAGTGG eiTi'C Tl ' C rr AGAAAOCATT 3960 

GCCATGCATG ACATTATAGC AGC3VGAAAAG TGCTTTOGCA CTQGGGCAOC AOGTAACAGA 4020 

ACAAGCCCAC AAGAGGGAGA AAC3GTACAAC TACAGCAAAT GCACXOTTGT GGTCCGGATT 4080 

ATGGAGTTTA CCACGACTCT GCTAAACACX; TCCCOSGAAG GATGGAAGCT CCTGAAGAAG 4140 

GACTTGTGTA ATACACACCT GATGAGAGTC CTGOTGCaOA CX3CTGTGTGA GCCCX3CAAGC 4200 

ATAGGTTTCA ACATCX3GAGA CGTCCAGGTT ATCGCTCATC TTCXTTGATCT TTGTGTGAAT 4260

CTQATOAAAa CTCTAAAOAT QTCCCCATAC AAAGATATCC TAQAGACCCA TCTGAGAGAG 4320 

AAAATAACAG CACAGAGCAT TGAGGAGCTT TGTGCCGTCA ACTTGTATGG CCCTGACGCG 4380 

CAAGTGGACA GGAGCAGGCT GGCTGCTGTT GTGTCTGCCT GTAAACAGCT TCACAOAGCT 4440 

GGGCTTCTGC ATAATATATT ACCX3TCTCAG TCCACAGATT TGCATCATTC TGTTGGCACR 4500 

GAACTTCTTT CCCTGGTTTA TAAAGGCATT GCCCCTCGAG ATGAGAGACA GTGTCTGCCT 4560 

TCTCTAGACC TCAGTTGTAA GCAGCTGGCC AGOGGACTTC TGGAGTTAGC CTTTGCTTTT 4620 

GGAGGACTGT GTGAGCGCCT TGTGAGTCTT CTCCTGAACC CAGCGGTGCT GTCCACGGCG 4680

TCCTTGGGCA GCTCACAGGG CAGCGTCATC CACTTCTCCC ATGGGGAGTA TTTCTATAGC 4740 

TTGTTCTCAG AAACGATCAA CACGGAATTA TTGAAAAATC TGGATCTTGC TGTATTGGAG 4800 

CTCATGCAGT CTTCAGTGGA TAATACCAAA ATGGTGAGTG CCGTTTTGAA CGGCATGTTA 4860 

GACCAGAGCT TCAGGGACCG AGCAAACCAG AAACACCAAG GACTGAAACT TGCGACTACA 4920 

ATTCTGCAAC ACTGGAA6AA GTGTGATTCA TGGTGGGCCA AAGATTCCCC TCTCGRAACT 4980 

AAAATGGCAG TGCTG6CCTT ACTGGCAAAA ATTTTACAGA TTGATTCATC TQTATCTTTT 5040 

AATACAAGTC ATGSTTCATT CCCPQAAGTC TTTACAACAT ATATTAGTCT ACTT6CTQAC 5100 

ACAAAGCTGG ATCTACATTT AAAGGOCCAA GCTGTCACTC TTCTTCCATT CTTCACCAGC 5160 

CrCACTGGAG GCAGTCTGGA GGAACTTAGA OGTCTTCTGO AfiCASCTCAT OSTTGCTCAC 5220 

TTCCOCATGC AGTCCAGGQA ATTTCCTCCA GGAACTCCGC GOTTCAATAA TTATGTGGAC 5280 

TGCATGAAAA AGTTTCTAGA TGCATTGGAA TTATCTCAAA GCCCTATQTT GTTGGAATTG 5340 

ATGACAGAAG TTCTTTGTCG GGAACAGCAG CATGTCATGG AAGAATTATT TCAATCCAOT 5400 

TTCAGGAGGA TTGCCAGAAG GGGTTCATGT GTCACACAA6 TAGGCCTTCT GGAAAGCGTG 5460 

TATGAAATGT TCAGOAAGGA TGACCCCCGC CTAAGTTTCA CACGCCAGTC CTTTGTGGAC 5520 

OGCTCCCTCC TCACTCTGCT GTGGCACTGT A6CCIGGATG CITTGAGASA ATTCTTCAGC 5580 

ACAATT6TGQ TGGATGCCAT TGAT G TOTTQ AAOTCCAGOT TTACAAAGCT AAATGAATCT 5640 

ACCTTTGATA CTCAAATCAC CAAGAAGATG GGCTACTATA AGATTCTAGA CQTGATGrAT 5700 

TCTCGCCTTC CCAAAQATQA TGTTCATGCT AAGOAATCAA AAATTAATCA AGTTTTCCAT 5760 

GGCTCBTGTA TTACAGAAGG AAATGAACTT ACAAAGACAT TGATTAAATT GTGCTACGAT 5820 

GCATTTACAG AGAACATGGC AGGAGAGAAT CAGCTGCTGG AGAGGAGAAG ACTTTACCAT 5880

TGTGCAGCAT ACAACTGCGC CATATCTGTC ATCTGCTGTG TCTTCAATGA GTTAAAATTT 5940 

TACCAAGGrr TrCrOTTTAG TGAAAAACCA GAAAAGAACT TGCTTATTTT TGAAAATCTG 6000 

ATC6ACCTGA AGCGCCX3CIA TAATTTTCCT GTAGAAGTTG AGGTTCCTAT GGAAAGAAAG 6060 

AAAAAGTACA TTQAAATTAa GAAAOAAiGCC AOAGAAQCAG CAAATGOGQA TTCAGATGST 6120 

CCTTCXTTATA TGTCTTCCCT GTCAXATTTG GCAGACASTA CCCTGAGTQA GGAAATGAaT 6180 

CAATTTGATT TCTCAACCtSG AGTTCAGAGC TATTCATACA GCTCCCAAGA COCTAGACCT 6240 

GCCACTGGTC GTTTTCGGAG ACGGGAGCAG CGGGACCCCA CGGTGCATGA TGATGTGCTG 6300 

GAGCTGGAGA TGGACGAGCT CAATOGGCAT GAGTGCATGG CGCCCCTGAC GGCCCTGGTC 6360 

AAGCACATGC ACAGAAGCCT GGGCCCX5CCT CAAGGAGAAG AGGATTCAGT GCCAAGAGAT 6420 

CTTCCTTCTT GGATGAAATT CCTCCATGGC AAACTGGGAA ATCCAATAGT ACC3VTTAAAT 6480 

ATCCOTCTCT TCTTAGCCAA GCTTGTTATT AATACAGAAG AGGTCTTTCG CCCTTACGCX; 6540 

AAOCaCTGGC TTAGCCCCTT GCTGCAGCTG aCIGCTTCTG AAAACAATGG AGGAQAAGGA 6600 

ATTCACTACA TG6TGSTTOA OATASTOGCC ACTATTCTTT CATGGACAGG CTTGGCCACT 6660 

CCAACAQGGG TCCCTAAAGA TGAAGTSTTA GCAAATCGAT TGCTTAATTT CCTAATGAAA 6720

CATGTCTTTC ATCCAAAAAG AaCTGTOTTT AOACACAACC TTGAAATTAT AAAGACCCTT 6780 

GTCGAGTGCT GGAAGGATTG TTTATCCATC CCTTATAGGT TAATATTTGA AAAQTTTTCC 6840 

GGTAAAGATC CTAATTCTAA AGACAACTCA GTAGGGATTC AATTGCTAGG CATCGTGATG 6900 

GCCAATGACC TGCCTCCCTA TGACCCACAG TGTGGCATCC AGAQTAGCQA ATACTTCCAG 6960 

GCTTTGGTGA ATAATATGTC CTTTGTAAGA TATAAAGAAG TGTATGCCGC TGCAGC3U3AA 7020 

GTTCTAGGAC TTATACTTCG ATATGTTATG GAGAGAAAAA ACATACTGGA GGAGTCTCTG 7080 

TGTGAACTGG TTGCGAAACA ATTGAAGCAA CATOWSAATA CTATGGAGGA CAAGTTTATT 7140 

GTGTGCTTGA ACAAAGTQAC CAAGAGCTTC CCTCCTCTTG CAGACAGGTT CATGAATGCT 7200 

GTGTTCTTTC TGCTGCCAAA ATTTCATGGA GTGTTGAAAA CACTCTGTCT GGAGGTGGTA 7260 

CTTTGTCGTG TGGAGGGAAT GACAGAGCTG TACTTCCAGT TAAAGAGCAA GGACTTCGTT 7320 

CAAGTCATGA GACATAGAQA TGATGAAAGA CAAAAAGTAT GTTTGGACAT AATTTATAAG 7380 

ATGATGCCAA AGTTAAAACC AGTAGAACTC CGAGAACTTC T6AACCCGGT TGTGGAATTC 7440 

GTTTCCCATC CTTCTAGAAC ATGTAG6GAA CAAATGTMA ATATTCTCAT GTCX3ATTCAT 7500 

GATAATTACA GA6ATCCA0A AAOTGAGACA GATAATGACT CCCAGGAAAT ATTTAAGTTO 7560 

GCAAAAGATG TGCTGATTCA AGQATTGATC GATOAGAACC CTGGACTTCA ATTAATTATT 7620 

OGAAATTTCT GGAGCCATGA AACTAGGTTA CCTTCAAATA OCTTGGACCG GTTGCTGGCA 7680 

CTAAATTCCT TATATTCTCC TAAQATAGAA QTGCACTTTT TAAGTTTAGC AACAAATTTT 7740 

CTGCTOSAAA TGACCAQCAT GAGCCCAGAT TATCCAAACC CCATGTTOGA GCATOCTCTG 7800 

TCAQAATGCG AATTTCAGGA ATATACCATT GATTCTGATT GGCGTTTCCG AAGTACTGTT 7860 

CTCACTCCGA TGTTTGTGGA GACCCAGGCC TCCCAGGGCA CTCTCCAGAC CCGTACCCAG 7920 

GAAGGGTCCC TCTCAGCTCG CTGGCCAOTQ GCAGGGCAGA TAAGGGCCAC CCAGCAGCA6 7980 

CATGACTTCA CACTGACACA GACTGCAOAX GGAASAAGCT CATTTOATTG GCTGACCGGO 8040 

AGCAGCACTG ACCCGCTGGT CGACCACACC A6TCOCTCRT CTOACTCCTT GCTGTTTGCC 8100 

CACAAGAGGA GTGAAAGGTT ACAGAQAGCA CCCTTQAAQT CAGTGGG6CC TGATTTTGGG 8160 

AAAAAAAQQC TGGGCCTTCC AGGGGACGAG GTGGATAACA AAGTGAAAGG TGCGGCCGGC 8220 

CGGACGOACC TACTACGACT GCGCAGACGG TTTATGAGGQ ACX3M3GAGAA GCTCAGTTTG 8280 

ATGTATGCCA GAAAAGGOGT TGCTGAGCAA AAACGAGAQA AGGAAATCAA GAGTGAGTTA 8340 

AAAATGAAGC AGGATGCCCA GGTCaTTCTG TACAGAAGCT ACCGGCAOGG AGACCTTCCT 8400 

GACATTCAGA TCAAGCACAO CAGCdCATC ACCOCGTTAC AGGCGSIGGC CCAGAGGQAC 8460 

CCAATAATTG CAAAACAGCT CXTTAGCAGC rim 'fl'lC l 'G GAATTTTGAA AGAaATGGAT 8520

AAATTTAAGA CACTOTCTSA AAAAAACAAC ATCACTCAAA A6TTGCTTCA AGACTTCAAT 8580 

OGrrrrcrtA ataccacctt ctctttcttt ocACCcrrra tctcttgiat tcaggacaxt 8640 

AGCTQTCAGC AOGCAOCCCT GCTQAGCCTC GACCCAGCEG CTQTTAGCGC TGGTTGCCTG 8700 

GCCAGCCTAC AGCAGCCCBT GGGCATCCGC CTGCTAGAGG AGGCTCTGCT CCGCCTGCTG 8760 

CCIGCTGAGC TGCCTGCCAA GCXJAGTCCGT GGGAAGGCCC GCCTCCCTCC TGATGTCCTC 8820 

AGATGGGTGG AGCTTGCTAA GCTQTATAGA TCAATTGGAG AATACX3ACGT CCTCCGTGGG 8880 

ATTTTTACCA GTGAGATAGG AACAAAGCAA ATCACTCAGA GTGCATTATT AGCAGAAGCC 8940 

AGAASTOATT ATTCTGAAGC TGCTAAGCAG TATGATGAQG CTCTCAATAA ACAAGACTGO 9000 

GTAGATGGTG AGCOCACAGA AGCOQAGAAQ GATTTTTGGG AACTTGCATC CCTTGACTGT 9060 
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TACAACCACC TTGCTGAGTG QAA&TCACTT GAATACTGTT CTACAGCCAG TATAGACAQT 9120 
OkOttACCSXC CAGACCTAAA TAAAATCTGG AGTGAACXAT TTTATCAGGA AACATATCTA 9180 
CCTTACATGA TCCGCAGCAA GCnSAAGCra CTOCTCCAQO GAOAOGCIQA CCAGTCCCTQ 9240 
CTGACATTTA TTGACAAAGC TATGC31CGGG GAGCTCCAQA AGGOGATTCT AGAGCTTCAT 9300 
TACAJSTCAAG AGCTGAGTCT GCTTTACCTC CTGCAAGATG ATQTTGACAG AGCCAAATAT 9360
TACATTCAAA ATGGCATTCA GAGTTTTATG CAGAATTATT CTAGTATTGA TGTCCTCTTA 9420 
CACCAAAGTA GACTCACCAA ATTGCAGTCT GTACAGGCTT TAACAGAAAT TCAGGAGTTC 9480 
ATCAGCTTTA TAAGCAAACA AGGCAATTTA TCATCTCAAG TTCKCXTTTAA GAGACTTCTG 9540 
AACACCTGGA CAAACAQATA TCCAGATGCT AAAATGGACC CaATGAACAT CTGQGATGAC 9600 
ATCATCACAA ATCQATGTTT CTTTCTCaGC AAAATAGAGG AGAAGCTTAC CCCTCTTCCA 9660
GAAGATAATA GTATGAATGT GQATCAAGAT GGftCACXXCA 6TOACAGGAT GGAA0T6CAA 9720 

GAGCAGGAAO AAGATATCAG CTCCCTQATC AGGAGTTGCA AGTTTTCCRT QAAAATGAAO 9780 

ATGATAGACA GTGCCOGGAA GCAGAACAAT TTCTCACTTG CTATGAAACT ACTGAAGGAG 9840 

CTGCATAAAG AGTCAAAAAC CAGAGACGAT TGGCT6GTQA GCTGGGTGCA GAGCTACTGC 9900 
CGCCTGAGCC ACTGCCGGAG CCGGTCCCAG GGCTGCTCTG AGCAGGTGCT CACTGTGCTG 9960

AAAACAGTCT CTTTGTTGGA TGAGAACAAC GTGTCAAGCT ACTTAAGCAA AAATATTCTG 10020 

GCTTTCCGTG ACCAGAACAT TCTCTTGGGT ACAACTTACA GGATCATAGC GAATGCTCTC 10080 

AGCACTGAGC CAGCCTGCCT TGCTGAAATC GAGGAGGACA AGGCTAGAAG AATCTTAGAG 10140 
CTTTCTGGAT CCAGTTCaaA QGATTCAQAG AAGGTQATCQ OGQGTCTGTA CCAGAGAGCA 10200

TTCCAGCSMS: TCTCTOAaGC TGTGCAGGCG GCIGASGAGG AGQCCCAOCC TCCCTCCTGa 10260

AOCTGTGGGC CTGCAGCIGG GGTGATTOAT GCTTACATGA CGCFGGCAGA TTTCTGTGAC 10320 

CAACAGCTGC GCAAGGAGGA AGAGAATGCA TCAGTTATTQ ATTCTGCAGA ACTGCAGGCX3 10380 

TATCCAGCAC TTGTGGTGGA GAAAATGTTG AAAGCTTTAA AATTAAATTC CAATGAAGCC 10440 

AGATTSAAGT TTCXTTAGATT ACTTCAGATT ATAGAACGGT ATCCAGAGGA GACTTTGAGC 10500 
CTCATGACAA AAGAGATCTC TTCCGTTCCC TGCTGGCAGT TCATCAGCTG GATCAGCCAC 10560

ATGGTGGCCT TACTGGACAA AGACCAAGCX: GTTGCTGTTC AGCACTCTGT GGAAGAAATC 10620 

ACTQATAACr ACCOSCAGGC TATTGTTTAT CCCTTCATCA TAAGCAGCGA AAGCTATTCC 10680 

TTCAAGGATA CTTCTACTGG TCATAAGAAT AAGGAGTTTG TGGCAAGGAT TAAAAGTAAG 10740 

TTGGATCAAG GAGGAiGTGAT TCAAOATTTT ATTAATGCCT TAGATCAGCT CTCTAATCCT 10800 
QAACTGCTCr TTAAOGATTO QAOCAATQAT GTAAGAGCtO AACTAGCAAA AACCCCIGTA 10860

AATAAAAAAA ACATTGAAAA AATGTATGAA AGAATOTATG CAGCCTTGGQ TGACXXaVAAQ 10920 

OCTCCAGGCC TQGGGGCCTT TAGAAGGAAG TTTATTCAGA CTTTTGGAAA AGAATTTGAT 10980

AAACATTTTG GGAAAQGAGG TTCTAAACTA CTGAiGAATGA AGCTCAGTGA CTTCAACQAC 11040 

ATTACCftACA TGCTACTTTT AAAAATGAAC AAAGACTCAA AGCCCCCTGG GAATCTGAAA 11100 
GAATGTTCAC CCTGGATGAG CGACTTCAAA GTGGAGTTCC TGAGAAATGA GCTGGAGATT 11160

CCCGGTCAGT ATGACGGTAG GGGAAAGCCA TTGCCAGAGT ACCACX3TGCG AATCGCCGGG 11220 

TTTGATGAGC GQGTGACAGT CATGGCGTCT CTGOJAAGGC CCAAGCGCAT CATCATCCX3T 11280

GGCCATQAGG AGAGGOAACA CCCTTTCCTG GTGAAGGGTG GCGAGGACCT GCGGCAGGAC 11340 
CAGOGOSTG6 AGCAGCTCTT CSTAGGTCATG AATGGGATCC TGGCCCau«3A CTCCGCCTGC 11400
AGCCAGAGGG CCCTGCAGCT GAGGACCTAT AGCGTTGXGC COVTGACCTC OWjGTTAGGA 11460

TTAATTGAGT GGCTTGAAAA TACTGTTACC TTGAAGGACC TTCTTTTGAA CaVCCATGTCX: 11520

CAAGAGGAGA AGGCGGCTTA CCTGAGTGAT CCC3«3GGCRC CGCCGTGTOA ATATAAAOAT 11580

TQOCTGACAA AAATGTCAGG AAAACATGAT GTTGGAGCTT ACATGCTAAT OTATAAGGGC 11640 

GCTAATCGTA CTGAAACAGT CACGTCTTTT AQAAAACQAG AAAGTAAAGT GCCTGCTGAT 11700 
CTCTTAAAGC GGGCCTTCGT GAGGATGAGT ACAAGCCCTG AGGCTTTCCT GGCGCTCCGC 11760

TCCCACTTCG CCAQCTCTCA CGCTCTGATA TGCATCAGCC ACTGGATCCT CGGGATTGGA 11820 

GACAGACATC TGAACAACTT TATGGTGGCC ATGGAGACTG GCX3GC0TGAT CGGGATCGAC 11880 

TTTGGGCATG CGTTTGGATC CGCTACACAG TTTCTGCCAO TCCCTGAGTT GATGCCTTTT 11940 

CGGCTAACTC GCCAGTTTAT CAATCTOATG TTACCAATGA AAGAAACXX3G CCTTATGTAC 12000 
AGCATCATGG TACACGCACT CCGGGCCTTC OGCTCAGACC CTGGCCTGCT CACCAACACC 12060

ATGGATGTGT rTGTCAAGGA GCCCTCCTTT GATTGGAAAA ATTTTGAACA GAAAATGCTQ 12120 

AAAAAAGGAG GGTCATGGAT TCAAGAAATA AATGTTQCia AAAAAAATTG GTACOCCOGA 12180 

CAGAAAATAT GTTACGCTAA GAQAAAOTTA GCSkGGTGCCA ATCCAGCAGT CTTTACTTGT 12240 

GATQAGCTAC TCCTGGGTCA TGAGAAGGCC CCTGCCTTCA QAGACTATGT GGCTGTGGCA 12300 
CGAGGAAGCA AAGATCACAA CATTCGTGCC CAAGAACCS«} AGAGTGGGCT TTCAOAAGAG 12360

ACTCAAGIGA AGTGCCTGAT GGACCA6GCA ACAGftCCCCA ACATCXmiGG CAGAACCTGG 12420 

GAAGGATGGG AGCCCTGQAT GTGAGGTCTG TGGGAGTCTG CAGATAGAAA GCATTACATT 12480 

GTTTAAAGAA TCTACTATAC TTTGGTTGGC AGCATTCCAT GAGCTGATTT TCCTQAAACA 12540 

CTAAAGAGAA ATGTCTTTTQ TOCTACSUSTT TCGTAGCATG AGTTTAAATC AAGATTATGA 12600 
TGAGTAAATG TQTATGGGTT AAATCAAA6A TAAGQTTATA GTAACATCAA AGATTAGGTG 12660

AGGTTTATAG AAA6ATAGAT ATCCAGGCTT ACCAAAGTAT TAASTCAAGA ATATAATATG 12720 

TQATCAGCTT TCAAAGCATT TACAAGIGCT GCAAGTTAGT GAAACAGCTO TCTCOGTAAA 12780 

TGGAGGAAAT GTGGGGAAGC CTTGGAATGC CCTTCTGGTT CTGGCACATT GGAAAGCACA 12840 
CTCAGAAGGC TTCaTCAtXaV AGATTTTGGG AGAGTAAAGC TAAGTATAGT TGATGTAACA 12900

TTGTAGAAGC AGCATAGGAA C3kATAAGAAC AATAGGTAAA GCTATAATTA TGGCTTATAT 12960

TTAGAAATOA CTGCATTTGA TATTTTAGGA TATTTTTCTA GGTTTTTTCC TTTCATTTTA 13020 

TTCTCTTCTA GTTTTGACAT TTTATGATAG ATTTGCTCTC TAGAAGGAAA OGTCTTTATT 13080 

TAGGAGGGCA AAAATTTTGQ TCATAGCATT CACTTTIGCT ATTCCAATCT ACAACTGGAA 13140 

GATACATAAA AGTOCTTTGC ATTGAATTTG GGATAACTTC AAAAATGCCA TGQTPOTTGT 13200 
TAOaGATAGT ACTAAGCATT TCAOTTCCAG GAGAATAAAA GAAATTCCTA TTTOAAATGA 13260

ATTCCTCATT TGGAGGAAAA AAAGCATGCA TTCTAflCACA ACAAGATOAA ATTATGGAAT 13320 

ACAAAAGTGG aTCCTTCCCA TGTGCAGTCC CTGTC CCCCC COGC CA6TCC TCCACaCCCA 13380 

AACTGTTTCT GATTGGCTTT TAGCtTTTTG TTGTTTTTTT TT TTCC TTCT AA C3VCTTG TA 13440 

TTTGGAGGCT CTTCTGTGAT TTTGAGAAGT ATACTCTTQA GTGTTTAATA AAGTTTTTTT 13500
CCAAAAGTA

Seq ID NO: 99 Protein sequence
Protein Accession #: NP_008835.5

1 11 21 31 41 51

I I I I I I 

MAGSGAGVRC SLLRLQETLS AADROGAALA GHQLIRGLGQ ECVLSSSPAV lALQTSIiVPS 60 

RDFGUiVFVR KSLNSIEPRE CREElIiKFLC IPLEKMGQKI APYSVEIKHT CTSWTKDHA 120 

AKCKIFALDL LIKLLQIFRS SRIjMDEFKIG EIiFSKFYGEL ALKKKIPDTV IiEKVYELIiGL 180 

LGEVHFSEHI MNAENLFRAP LGELICTQMTS AVREFKIiFVL AGCLKGLSSL XOIFTKSMEB 240

DP(?rSREIFN FVIiKAIRFQI DIiKRYAVPSA QLKLFAIiHAS QFSTCLLDNY VSIaFEVLIiKW 300 

CAamVBLKK AALSAIiESFIi KaVSMMVAKH AEMHKNKLQY FMBQFYGIIR NVDSNHKEIaS 360 
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lAIRGYQLFA GPCKVINAKD VDn«VBI>IQ RCKQHFLTQT DTCDD R VYQM PSFLQSVASV 420 

LLYLDTVPEV YTPVLEHLW MQIDSFPQYS PKMQLVCCRA IVKVFLAIiAA KGPVLKNCIS 480 

TVVHQ6I.IRI CSKPWLPKQ PBSESEDHRA 8QBVRTGKHK VPTYKDYVDL FRHLI.SSDQM 540

MDSIIiADEAF FSVNSSSBSI> NHU.yDEPVK SVUCIVBKU) LTIiBIOTVOS QBNGDEAPGV 600 

WMIPTSDPAA mjHPAKPKDF SAFINIiVBFC REILPEKQAE FFEPHVYSFS YEIiILQSTRL 660 

PLISGFYKLL SITVRNAKKI KYFEGVSPKS LKHSPEDPEK YSCFALFVKF GKEVAVKMKQ 720 

YKDEUASCL TPLLSLPHNI lELDVKAYVP ALQMAFKIiGIj SYTPLAEVGL NALEEWSIYl 780 

DRHVMQPYYK DILPCLDGYL KTSALSDETK NNWBVSALSR AAQKGFNKW LKHLKKIKUL 840 

SSNEAISLEE IRIRWQMLG SLGGQINKNL LTVTSSDEfW KSYVANOREK RIiSFAVPFRE 900 

HKPVIFLDVF LPRVTELALT ASDRQTKVAA CEI.LHSHVMF MLGKATQMPE GGQOKPPKYQ 960 

LYRRTPPVU, RIiACZIVDQVT BQLYSPLV14Q LIHWPTNNXIC FESQDTVAI.I, ElUIiDOIVDP 1020 

VDSTLROFCX: RCIREFLKHS IKQXTFQQQB KSPVimCSIiF KRLYSIiAIiRP NAFKRIiGASL 1080

AFNNIYREFR EEESLVEQFV PEALVIYMES lALAHADEKS LGTIQQCCDA IDHLC»IIEK 1140 

KHVSIilKAKK RRIiPRGFPPS ASI^LDLVK WLLAHCGRPQ TEORHKSIEL FYRFVPLLPG 1200 

NRSPNLWLKD VLKEBGVSFL INTFBGGGCG QPSGILAQPT LLYLRGPFSL QATLCKLDU, 1260 

lAAIiECVNTF IGERTVGAIX3 VLGTERQSSL LKAVAFFLES lAMHDIIAAE KCFC3TGAAGH 1320 

RTSPQEGERY NYSKCTVWR IMEFTTTUJJ TSPEGWKLLK KDLCNTHLMR VLVQTLCEPA 1380 

SIGFNIGDVQ VMAHLPDVCV KLMKALKMSP YKDILETHLH EKITAQSIEE l,CAVin:.YQPD 1440 

AQVQRSRLAA WSACKQLHR MUJiBNIIiPS QSTOLHHSVO TEIJiSLVYKa IAP<3DERQCI> 1500 

PSLQLSCKQb ASGI.LELAFA FGGLCERLVS LLLHPAVLST ASIiGSSQCSV IHFSHGEYFY 1560 

SIiFSETIIlTB UiKHLOIAVL BLKQSSVSNT XMVSAVLHGK LOQSFRERAH QKHQGLKLAT 1620 

TILQHWKKCD SVTWAKDSPLE TKMAVUUilA KILQIDSSVS FNTSHGSFPE VFTTYISLLA 1680 

DTKLDLHIjKG QAVTIiLPFFT 8I.TGQSLEE1. RRVLEQLIVA HFPMQSREFP PGTPRFIOIYV 1740 

DCMKKFLDAL ELSQSPMLLE LMTEVLC31EQ QHVMEBLPQS SFRHIARRGS CVTQVGLLES 1800 

VYEWFRKDDP RIiSPTRQSFV DRSLI.TI.LWH CSLDAIjREFF STIWDAIDV liKSRFTKLiNE 1860

STFDTQITKK MGYYKIU3VM YSRLPKDDVK AKESKINQVF HGSCITBGNB LTKTLIKLCY 1920 

DAFTENMAGE MQIiLERRRLY HC3UIYHCAIS VICCVFNBIiK FYQGFI.FSEK PEKNLLIFEM 1980 

IiIDIdOWYHF PVEVEVPtfER KKmriBIRXB AREAAHGDSD GPSYMSSLSY UU}STLSEEM 2040 

8QFDFSTGVQ SYSYSSQDPR PATQRFRRIIE QRDPTVHDDV IfLBtDBLMR RECMAPLTAL 2100 

VKHMHRSLOP PQOEEDSVPR DIiPSMMKFIjR OTUMPIVPI. KZRbPIAKIiV IMTEEVFRPY 2160 

AKHHIiSPLLQ LAASENMGGE GIHYMWEIV ATIMHTGLA TPTGVPKDEV lANRUUTPLK 2220 

XHVFHPKRAV FRBNLBIIKT LVBCWKDCbS IPYRtlFEKF SGKDPKSKDK SVGIQLI.GIV 2280 

MANDLPPYDP QCXSIQSSBYF QAIiVNKMSFV RYKEVYAAAA EVI/3LIIJIYV MERKNILEES 2340 

LCBIjVAKQLK QHQNTMEDKF IVCLNKVTKS FPPLADRFMN AVFFIiLPKFH GVLKTLCLEV 2400 

VLC3lVEt»ITE LYFQIiKSXDF VQVMRHRDDE RQKVCLDIIY KMMPKLKPVE LREIiLNPWE 2460 

FVSHPSTTCR EQMYNILMWI HDNYRDPESE TDNDSQEIFK LAKDVLIQQI. IDENPGLQLI 2520 

rHKFWSHETR LFSNTLDRLIi ALHSLYSPKI EVHFIiSLAIK FLI.EMTSMSP OYPNFMFEHP 2580 

LSECEFQBYT IDSDWRFRST VLTPMFVBTQ ASQGTLQTRT QEGSLSAIIWP VAGQIRATQQ 2640 

QHOFTLTOTA DGR8SFDWI.T 6SSTDPLVDH TSPSSDSLLF ABXRSERIOR APUCSVGPDF 2700 

GKKRIiOLPGD EVDNKVXGAA GRTOUUUJtR RFMROQEKLS LKYARKGVAB QKREKEIXSE 2760 

liKMKQDAQW IiYRSYRHGDI. PDIQIKKSSL ITPLQAVAQR DPIIAKQI.F8 SLPSGILKEM 2820 

DKPKTLSEKN NITQKLLQDP NRFLNTTPSF FPPFVSCIQD ISCQHAAUjS LDPAAVSAGC 2880 

LASLQQPVGI RLIiEEALLRL LPAELPAKHV RGKARLPPDV LRWVBIjAKI.Y RSIGEYDVLR 2940 

GIFTSEIGTK QITQSAIiIiAE ARSDYSEAAK QYDEALNKQD WVDGEPTEAE KDFWEIjASliD 3000 

CYNHIiABHKS LBYCSTASID SENPPDLNKI WSEPPYQETY LPYMIRSKLK LLLQGBADQS 3060 

U^TFIDKAMH GSIiQKAILEL HYSQELSIjIiY LU2DDVDRAK YYIQNGIQSF MQNYSSIDVL 3120 

LHQSRLTXI/Q SVQAIiTBIQB FISFISKQSt LSSQVPLKRI. UITWTNRYPD AKMDPt4NIWD 3180 

DIITNRCFPI. SKIEBKLTPL PEDNaWVDQ DGDPSDRMEV QEQEEDISSL IRSCKFSMKM 3240 

KMIDSARKQM KFSrAMKU.K ELHKESKTRD DWLVSWVQSY CRLSHCRSRS QGCSBQVLTV 3300 

IiKTMSUUDEH NVSSYLSKNI liAFRDQNII>L GTTYRIIANA LSSEPACLAE lEEOKARRIIi 3360 

ELSGSSSEOS EKVIAGLYQR AFQHLSEAVQ AAEEEAQPPS HSCGPAAGVI OAYHTI.ADFC 3420 

DQQLRKEEEH ASVIDSAELQ AYPALWEKM LKAIiKUISNE ARIiKFPRUiQ XIBRYPBETL 3480 

SLMTKEISSV PCWQFISWIS HMVAIiLDKDQ AVAVQHSVEE ITDinrPQAIV YPFXISSESY 3540 

SFKDTSTGHK NXEFVARIKS KXiDQGGVIQD FINAUJQLSN PBIiLFKDWSN DVRAELAKTP 3600 

VNKKNIEKMY ERKYAALOTP KAPGWAFRR KPIQTFQKEF DKHFOKGGSK liLRMKLSDFN 3660 

DITNMLLLKM NKDSKPPGNL KECSPWMSDF KVEFLRMELE IPGQYDGRGK PLPEYHVRIA 3720 

GFDERVTVMA SIiRRPKRIII RGHDEREHPP LVKGGEDLRQ DQRVEQLFQV MNGIIAQDSA 3780 

CSQRALQI.RT YSWPMTSRI. GI.IEWI£NTV TLKDUiLNTM SQEEKAAYLS DPRAPPCEYK 3840 

DHLTRMSQKH DVGAYMLMYK aANRTBTVTS FRKRESRVFA OLIJCRAFVRM STSPBAFLAb 3900 

RSHFASSHAIi ICISHWILGI GDRHIjMMFMV A^fETG6VIGI DFGHAFGSAT QFLPVPELMP 3960 

FRIiTlIQFINIi MIiPMKBTiaU! YSIMVHAI.RA FRSOF6IiI>TN TMDVFVKEPS FDWKHFEQXM 4020 

LKKQQSHIQE INVAEKNWYP RQKICYAKRK LAGANFAVIT CDEIiLLGHBK APAFRDYVAV 4080 
ARGSKDHNIR AQBFBSGLSB ETQVKCUfDQ ATDPNII^T HEGHEPHM 

Seq ID NO: 100 DNA sequence
Nucleic Acid Accession #: NM_000673
Coding sequence: 101-1235 

1 11 21 31 41 51

I I I I I I

ATOTGAAGGC ACAAGCTOCI GTTATATACA ACAGAGTSAA CIOAGCATCA GTCAOAAAAA 60 

QTCTATOTTT GCAOAAATAC AGATCCAAGA CAAAGACAG6 ATGGGCACTG CIGGAAAAGT 120 

TATTAAATGC AAAGCAGCTG TGCTTTOGGA GCAGAAGCAA CXXTTCTCCR TTGAGOAAAT 180 

AGAAGTTGCC CCACX»AAGA CTAAAGAAGT TCGCATTAAG ATTTTGGCCA CRGGAATCTG 240 

TOGCACAGAT GACCATGTGA TAAAAGGAAC AATGGTQTCC AAfiTTTCCAG TGATTGTGGG 300 

ACATGAGGCA ACTGGGATTG TAGAGAGCAT TGGAQAAGGA GTGACTACAO TGAAACCAGG 360 

TGACAAAGTC ATCCCTCTCT TTCTGCCACA ATGTAOAGAA TGCAATGCTT GTCGCAACCC 420 

AGATGGCAAC CTTTQCATTA GGAGOGATAT TACTGGTCGT GGAGTACTOG CTOATGGCAC 480 

CACCAGATTT ACATGCMGG GCAAACCAGT ACACCACTTC ATGAACACCA OTACATTTAC 540 

GGAGTACACA GTOOTOGATG AATCTTCTOT TGCTAAGATT GATGATCCAS CTCCTCCTGA 600 

GAAAOTCTGT TTAATTGGCT GTGGGTTTTC CACTGGATAT GGOGCTGCTG TTAAAACTGG 660 

CAAGGTCAAA CCTGGTTCCA CTTGCSTCGT CTTTGGCCTX5 GGAGGAGTTG GCCTGTCAGT 720 

CATCATGGGC TGTAAGTCAG CTGGTGCATC TAGGATCATT GGGATTGACC TCAACAAAGA 780 

CAAATTTGAG AAGGCCATGQ CTGTAGOTGC CACTOAGTGT ATCAGTCCCA AGGACTCTAC 840 

CAAACCCATC AGTOAGGTGC TGTCAGAAAT GACAGGCAAC AAOGTGGGAT ACACCTTTGA 900 

AGTTATroGO CATCTTGAAA CCATQATTGA TGCXXTTGOCA TCCTGCCACA TGAACTATGG 960 

GACCAGCXSTG GTTGTAGGAG TTCCTCCATC AGCCAAGATG CICACCXATQ ACCCGATGTT 1020 
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GCTCTTCRCT GGACGCACAT GGAAGGGRTG TGTCTTTGGA GGTTOSAAAA GCAGAGATOA 
TQTCCCAAAA CTAGTGACTG AGTTCCTGGC AAAGAAATTT GACCTGGACC AGTTOATAAC 
TCATGTTTTA CCATTTAAAA AAATCAGTGA AGGATTTGAQ CTQCTCAATT CAGGACAAAG 
CATTCGAACG GTCCTGACBT TTTGAGATCC AAAGTGGCAG GAGGTCTGTG TTGTCATGGT 
GAACTGGAGT TTCTCTTGTG AGAGTTCCCT CATCTGAAAT CATGTATCTG TCTCACAAAT 
ACAAGCftTAA GTAGAAGATT TGTTGAAGAC ATAGAACCCT TATAAAGAAT TATTAACCTT 
TATAAACSCTT TAAAGTCTTG TGAGCACCTG GGAATTAGTA TAATAACAAT GTTAATATTT 
TTGATTTACA TmGTAAGG CTATAATTGT ATCTTTTAAG AAAACATACA CTTGGATTTC 
TATGTTGAAA TGGAGATTTT TAAGAGTTTT AACCAGCTGC TGCAGATATA TAACTCAAAA 
CAGATATAGC GTATAAAGAT ATAGTAAATG CATCTCCCAG AGTAATATTC ACTTAACACA 
TTGAAACTAT TATTTTTTAG ATTTGAATAT AAATGTATTT TTTAAACACT TOTTATOAGT 
TAACTTGGAT TACATTTTGA AATCAGTTCA TTCCATGATG CATATTACTO GA1TAGATTA 
AGAAAGACAG AAAAGATTAA GGGAOGGGCA CATTTTTCAA OGATTAAOAA TCATCATTAC 
ATAACTTGGT OAAACTGAAA AAGTATATCA TATGGGTACA CAAGGCTATT TGCCAGCATA 
TATTAATATT TTAGAAAATA TTOCTTTTGr AATACTGAAT ATAAACATAG AGCTAGAGTC 
ATATTATCAT ACITATCATA ATGTTCAATX IXWIACAGTA OAAITGCAAa TCCCXAASTC 
CCTATTCACI GTGCTTAGTA GTGACTCCAT TTAATAAAAA GTQTTTTTAG TTTTTAACAA 
CTAAACCG 



1 11 21 31 41 51

I I I I I I 



MGTAGKVIKC KAAVLWEQKQ PPSIEEIEVA PPKTKEVRIK ILATGICRTD DHVIKGTMVS
KFPVIVGHEA TGIVESIGBG WTTVKPGDKV IPLPLPQCRE CNACRNPDGH LCIRSDITGR 
GVLADGTTRF TCKGKPVHHF MNTSTFTEYT WDESSVAKI DDAAPPEKVC LIGCJQPSTGY 
GAAVKTGKVK PGSTCWVFGL GGVQLSVIMS CKSAGASRII GIDIiMKDKPE KAMAVGATEC 
ISPKDSTKPI SEVLSEMTOI NVGYTFEVIG HI.ETMIDAUI SCHMNYGTSV W6VPPSAKM 
LTYDPMLIiFT GRTWKGCVFG GLKSRODVPK LVTBPIAKKP DI.DQLITHVL PPKKISEQFE
LLNSGQSIRT VLTF 

Seq ID NO: 102 DNA sequence
Nucleic Acid Accession #: NM_006783.1
Coding sequence: 1..786

1 11 21 31 41 51 

I I I I I I

ATGGATTGGG GGAOGCTGCA CACTTTCATC GGGGGTGTCA ACAAACACTC CACCAGCATC

QGQAAGGTQT GGATCAOUST CATCTTTATT TTCCGAGTCA TGATCCTAGT GGTGGCTGCC
CAGGAAGTGT GGGGTGAOOA aCAAOAGOAC TTOGTCTGCA ACACACTGCA ACCGGGATGC 
AAAAATGTGT GCTATGACCA CTTTTTCCCG GTGTCCCACA TCCSWCTGTG GGCCCTTCCAG 
CTQATCTTCG TCTCCACCCC AGCGCTGCTG GTGGCCATGC ATGTGGCCTA CTACAGGCAC 
GAAACCACTC GCRAGTTCAG GCGAGGAGAQ AAQAGGAATG ATTTCAAAGA CATAGAGGAC 

ATTAAAAAOC ACAAOGITCG GATAGAGGGG TCX5CXGTGGT GGACX3TACAC CAGCAGCATC
TTTTTCOOAA TCATCTTTQA AGCAOCXnTTT ATGTAXGTGT TTXACTTCCT TTACAATGGQ 
TACCACCTGC OCTGGGTGTT GAAAT6TGG6 ATTQAOCCCT GCCCCAAOCT TCTTGACTQC 
TTTATTTCTA GGCCAACAGA GAAGACOSXG TTTACCATTT TTATGATTTC TGC6TCIGTG 
ATTTGCATGC TGCTTAACGT GGCAGAGTTG TGCTACCTGC TGCTGAAAGT QTGTTTTAGG 

AGATCAAAGA GAGCACABAC GCAAAAAAAT CAOCCCAATC AT6CCCTAAA GQAQAGTAAO
CAGAATGAAA TGAATGRGCT GATTTCaGAT AGTGOmaAA ATGCAATCAC AOOTTTCCCA 
AGCTAA 
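
The nucleotide listings in this document are laid out as rows of ten-base blocks, in some places followed by a running position count, with the coding region given as 1-based inclusive coordinates (for example "Coding sequence: 1..786"). The following is a minimal sketch, under those assumptions, of how such rows might be joined and a coding region cut out. The function names and the toy rows are hypothetical and are not part of the listing; margin numbers or other residue at the start of a row are assumed to have been removed beforehand.

```python
def join_sequence_rows(rows):
    """Concatenate listing rows into one uninterrupted sequence string.

    Assumption: each row holds whitespace-separated sequence blocks and may
    end with an all-digit running position count, which is dropped.
    """
    parts = []
    for row in rows:
        tokens = row.split()
        # A trailing all-digit token is treated as the running position count.
        if tokens and tokens[-1].isdigit():
            tokens = tokens[:-1]
        parts.append("".join(tokens))
    return "".join(parts)


def extract_coding_region(sequence, cds_start, cds_end):
    """Return the coding region for 1-based, inclusive coordinates."""
    if not 1 <= cds_start <= cds_end <= len(sequence):
        raise ValueError("coding coordinates fall outside the sequence")
    return sequence[cds_start - 1:cds_end]


# Toy rows for illustration only (not taken from the listing).
toy_rows = ["AAAAAAAAAA CCATGGCTTG 20", "ATTT"]
toy_sequence = join_sequence_rows(toy_rows)
toy_cds = extract_coding_region(toy_sequence, 13, 21)
```

In the toy example the coding region runs from position 13 to position 21 and consists of three codons (ATG, GCT, TGA).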



1 11 21 31 41 51

I I I I I I 

MDHGTIiHTFI GGVNKHSTSI GKVniTVIFI FRVMILWAA QBVHGDEQED FVC3ITLQPGC 
XNVCXDHFFP VSHIRLWALQ LIFVSTPALL VAKHVAYYUH BTTRKFRRGE XRNDPXDIED 
IIOaiKVRIEG SLWWTYTSSI PFRIIPEAAP MXVFYFI.YNG YRLPHVLKOG IDPCPKLVDC 
t PTIPMISASV ICMLLNVAEI. C 
) SGQNAITGFP S 



Coding sequence: 86-526

1 11 21 31 41 51

I I I I I I

GGACCIGGGA AGGAGCATAG GACAGGGCAA GGCXK3GATAA GGAGGGGCAC CACAGCCCTT 
AAGGCACGAG GGAACCTCAC TGCGCATGCT CCTTTGOTGC CCACCTCAQT GCOCATOTTC 
ACTGGQCGTC TTCCCATCGG CCCCTTCGCC AGTGTGGGGA ACGCGGCGGA GCTGTGAGCC 
GGCGACTCGG GTCCCTGAGG TCTGGATTCT TTCTCOGCTA CTQAGACACG GOGGACACAC
ACAAACACAG AACCACACAG CCAGTCCCAG GAGOCCAGTA ATGGACAGCC CCAAAAAGAA
GAACCAGCAG CTGAAAOTCQ GGATCCTACA CCIGGGCAGC AGACAGAAGA AGATCAGGAT 
ACAGCTGAGA TCCCAOTGCG CGACATGQAA QOTQATCTGC AAQAGCIGCA TCAGTCAAAC 
ACOGGGGATA AATCTGGATT TGGGTTCOGG CSTCAAGGTG AAOATAATAC CTAAAGAOQA 
ACACTGTAAA ATGCCAGAAQ CAGGTGAAGA GCAACCACAA GTTTAAATGA A6AO\AGCTG 
AAACAACGCA AGCTGGTTTT ATATTAGATA TTTQACTTAA ACTATCTCSVA TAAAGtTTTG
CAOCTTTCAC CAAAAAAAAA AAAAAA 
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75 
80 
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I I I I I I 

MLLHCPPQCA CSIXSVFPSAP SPVMGTRRSC EFATRVPEVW ICSPUUtHGG HTQTQNHTAS 60 
PRSPVMESPK ECXNQQI'KVGI LBLGSBQKKI RIQLRSQCAT WKVICKSCIS QTPGIHIiDLG 120 
SGVXVKIIPK EEHCKMPEAG EEQPOV 

Seq ID NO: 106 DNA sequence
Nucleic Acid Accession #: vr04139
Coding sequence: 99-587 

1 11 21 31 41 51 

I I I I I I

CATCCCTCTG GCTCCAQAGC TCSVQAOCCAC CCACAGCXrGC AGCCATGCTQ TGCCTCCTGC 60 

TCaCKCTGGG CGTGGCCCTG GTCTGTGGTa TCCOSGCCAT GGACATCCCC CAGACCAAGC 120 

AGGACCTGGA GCTCXX»AAG TTGGCAGGGA CCTGGCACTC CATGGCCATO GCGACCAACA 180 

ACATCTCCCT CATGGCGACA CTGAAGGCCC CTCTGAGGaT CCACATCACC TCACTGTTGC 240 

CCACCCCCGA GGACAACCTG GAGATCGTTC TGCACAGATG GGAGAACAAC AGCTGTGTTG 300 

AGAAGAAG6T CCrTGOAGAG AAGACTGGGA ATCCAAAGAA GTTCAAGATC AACTATACGG 360 

TGGCGAAOQA GGCCAOGCTQ CTCOATACTG ACTACGACAA TTTCCTGTTT CTCTOCCTAC 420 

AGOACACCAC CACCXXCATC CAGAGCATOA TCTGCCAGTA CCTGGOCAGA 6TCCTG6TGO 480 

AGGACGATGA GATCAOXSCAa GOATTCATCA GOGCTTTCAS QCCCCTGCCC AGGCACCTAT 540

GGTACTT6CT GGACTTGAAA CAGATGGAAG AGCCGTGCCG TTTCTAGCTC ACCTCCGCCT 600 

CCAOOAAGAC CAOACTCCCA COCTTCCACA CCTCCAGAOC AOIGGGACTT CCTCCT6CCC 660 

TTTCAAAGAA TAACX»CAGC TCAGAAGA06 ATGAOSTGOT CATCTQTGTC GCCATOOCCT 720 

TOCTGCTGCA CACCTGCACC ATTGCCATGG GGAGGCTGCT CCCTGGGGGC AGAGTCTCTG 780 
GCAGAGGTTA TTAATAAACC CTTGGAGCAT G


1 11 21 31 41 51 

I I I I I I 

MDIFQTKQDL ELPKLAGTWH SMAMATNNIS LMATLKAFUl VHITSLLPTP EONLEIVLRR 
WEtmSCVEKK VltOEXTQHPK KFKINYTVAN EATIiUyTDYD NFI.PI>CLOI>T TTPIQSMHCQ 
YLABVLVEOD EIKOGFISAF RFIiPHHUtYI. LDLKQMEEPC RF

Seq ID NO: 108 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 48-794




1 11 21 31 41 51 

I I I I I I 

TCCCAGGCAG CAGTTAGCCC GCOSCCOGCC TOTGrGTCCC CAGAGCCATG GAGAGAGCCA 60 

GTCTGATCCA GAAGGCCAAG CTGGCSWSAGC AGGCCGAACQ CTATGAGGAC ATGGCAGCCT 120 

TCATGAAAGG CGCCGTGGAG AAGGGCGAGG AGCTCTCCTG CGAAGAfiOGA AACCTGCTCT 180

CASTAGCCTA TAAGAACGTG GTGGGCC3GCC AGAGGGCTGC CTGGAGGGTG CTGTCCAGTA 240 

TTGAGCAGAA AAGCAACGAG GAGGGCTCGG AGGAGAAGGG GCCCGAGGTG CGTGAGTACC 300 

GGGAGAAGGT GGAGACTGAG CTCCAGGGCG TGTGCGACAC CGTGCTGGGC CTQCTGGACA 360 

GCCACCTCAT CAAGGAGGCC GGGGACGCCG AQAGCCGGGT CTTCTACCTG AAGATGAAGG 420 

GTGACTACTA CCGCTACCTQ QCCQAfiGTGG CCACCGGTGA CGACAAGAAG CQCATCATTG 480

ACTCAGCCOQ GTCAGCCTAC CAGQAGGCCA TGGACATCAG CAAQAAGGAG ATGCCQCCCA 540 

CCAACCCCAT CCGCCTGGGC CTGGCCCIGA ACTTTTCOGT CTTCCACTAC QAGATCGCCA 600 

ACAGCCCCGA 6QAGGCCATC TCTCTGGCCA AGACCACCTT GQAOQAGQCC ATGGCIGATC 660 

TGCACACCCT CAGCGAGGAC TCCTACAAAG ACAGCACCCT CATCATGCAG CTGCTQCGAG 720 

ACAACCTGAC ACTGTGQAOG GCCGACAACG CCGGGGAA6A GGGGGGCGAG GCTCCXiCAaG 780

AGCCCCAGAG CTGAGTGTTG CCOGCCACCG CXTCCGCCCTG CCCCCTCCaG TCCCCCACCC 840

T6CCGAGAGG ACTAGTATGG GGTGGGAGGC CCCACCCTTC TCCCCTAGGC GCTGTTCTTG 900 

CICCAAAGGG CTCCGTGGAG AaGOACTGGC AGAGCTGAGG CCACCTGGGG CTGGGGATCC 960 

CACTCTTCTT GCAGCTGTTG AGOGCACCTA ACCACTGGTC ATGCCCCCAC CCCTGCTCTC 1020 

CQCACCGGCT TCCTCCCGAC CCCAGGACCA GGCTACTTCT CCCCTCCTCT T
CTGOCCCTOC TGCCTCTGAT OBTAG6AATT G 






1 11 21 31 41 51

I I I I I I

tIERASLIQiCA KLAEQAERVE DMAAFHKGAV BKGEELSCEB RNLLSVAYKH WGGQRAAHR 
VLSSIEQXSN EEGSEEKGPE VRBXRBKVET BLQGVCDTVL 6U>DSBt.IKB AGDABSRVFY 
LKMKGDYYRY lAEVATGDDK KRIIDSARSA YQBAMDISKX EMPPTOPIRI. GbAWFSVFH 
YEXANSPEBA ISLAKTTFDE AfWDLHTLSB DSYKDSTIiIM QIiUiI]in.TLN TADNAGBBGG 
EAPQEPQS 
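
Each nucleotide entry in this listing gives its coding region as a pair of 1-based inclusive positions, and the protein entry that follows shows the translated residues in rows of sixty. The sketch below shows one way to check that the two agree; it assumes that the stated coordinates include the stop codon, which is an assumption for illustration and not something the listing states. The function name is hypothetical.

```python
def expected_protein_length(cds_start, cds_end, includes_stop=True):
    """Number of residues encoded by a coding region.

    Coordinates are 1-based and inclusive. If the region includes the stop
    codon (assumed by default), one codon is subtracted from the count.
    """
    nucleotides = cds_end - cds_start + 1
    if nucleotides <= 0 or nucleotides % 3 != 0:
        raise ValueError("coding region is not a positive multiple of three")
    codons = nucleotides // 3
    return codons - 1 if includes_stop else codons


# Coordinates as printed in the headers above.
length_48_794 = expected_protein_length(48, 794)
length_99_587 = expected_protein_length(99, 587)
length_1_786 = expected_protein_length(1, 786)
```

For "Coding sequence: 48-794" this gives 248 residues, which matches the protein entry above (four rows of sixty residues followed by a row of eight). A mismatch between the computed length and the printed protein would point to a misread coordinate or to missing residues in a row.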



Seq ID NO: 110 DNA sequence
Nucleic Acid Accession #: NM_000695
Coding sequence: 407-1564 



1 11 21 31 41 51 

I I I I I I 

CACOAGTTGG TTTGGGAGCT GCCAGTCTCC TGGGAGGATC GCAGTCAGCA GAGCAGGGCT 
GAGGCCTGGG QGTAGGftGCSV GAGCCTGCGC ATCTGGAGGC AGCATGTCCA AGAAAGGGAG

TGaAGGTQCA GCGAAGGACC CAGGGGCAGA GCCCAOSCrG GGGATGGACC CCTTCGAGGA
CACACTGCGG OGGCTGCGTO AGGCCTTCAA CTOAGGGCGC ACGCGGCCGG CCGAGTTCGG 
GGCTGCGCAG CTCCAGGGCG TGGGOCACTT CCTTCAAOAA AACAAGCAGC TTCTGOOOGA 
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OGrracTGOcc cjusqacctgc a 



ACGGTCCAC6 
CCTGGTCXrrC 
GGGCACCCTC 
A6AGAAGGTC 



CACAGGGAGC 
TGTCACCCTG 
GACG6TGGCC 
CCCTGACTAC 
CACCATCACC 
CAACCABAAA 
GGGCCAGAGC 
GACGGAGCXrr 



CTGGCTGAOa 
CCCCAGGAGA 
CCTOCSTGTGG 
GAQCTGGGGO 
AACCGCX3TGG 
GTCCTGTGCA 
C6TTTCTATG 
CAGTTCKAGC 
AACGAGAGC6 
GTGATGCAGG 



CCTGGAACTA 
ATTGCGTGGT 
TGCTGCCCCA 



GCRAGATTGT 
GCAAGAACCC 
CCTGGTTCTG 



TTTOGAGGCA 
QAACCTTCAG 
CTOGGTCTTC 
CCCATTGAAC 
GCTGAAGCCG 
GTACCTGGAC 
GCTAGAGCAC 



CACCTGCCTG 
TACCGACIGG 
GTGAGCGTCC 
GCXTATGCTC 
TGGAGCTC3TC 



AGCAGAOWSG 
GAGGGCTTCA 
ATGGGCCGGT 



CACCCGCCTC 



CTGCTACGTG 
CTACTTCAAT 
GCCCOQAGAT GCAQGAGAGG 
GOSAOGACCC CCAGAGCTCC 
ATTQCTGGGC 
CGCCCCCACG 
CGGGCCCATC 
CCX3GCAGGAG 
GATGCTGGAG 
TCTGCTGTCC 



GACATATCT6 
GCCTGGATGA 
ATCTGGAAGG 
CTGACCCTGG 
TCAGAAATCA 
CAGAGCTGCT 
AAGTTGGACT 
GCCACCAAGC 
GAOQACAACT 



AGCTCATCCT 
ABGA TBAA CC 
AACCCTTTGG 
TGCTCCTGGT 
GCCAGGGCAC 
TTGCCX3TGGT 
ACATCTTCTT 
ACCTGACGCC 
GOGACCCCCA 640 






GCCGGCCAGA CCTGCGTGGC 900 



CACCCCACCC 



6AA0G6TTGA 
TTCCACCTCT 
CCCACACTG8 
AGCTCCATCC 
CTGGGGGCAA 
CCAAAATGGA 
CCCTCACACA 
AGACACAGGO 
GATGCTTACC 
TGTQACrrAC 
CCCTTGGCTG 
GGAATCCTCT 



ACATGACTGC 
GCTGCTCGAG 
TCCCCAATTC 
GTGTCACCCT 



ATCGCTACAT 
AGQAGATCTT 
AGTTCATCAA 
TTGTGAACCA 
CCTACATATC 
ACCACGQCAA 
COGGCCTGGA 
TGTTACGCTG 
CAACGGGTCA 
TTGTTCCTCC 
ATCCTGCCTG 
AGAGG CCGA G 
CAGCCCTTTG 
GGAAAATACA 
CCCTCCAGGC 
AACTQCACCA 



CTGCTGCCCG 
CCAAACCTOG 
TGOGGCOGOG 
GTGCTGGTGG 
CTGCCCATCG 
AAGCCXXTTOG 
CGQACCAGCA 
GTGCCATTCG 



CCCTGCAQAG 960 



GCCGCATCAT 



ACGTGCAGGA 

TGAAorrGcav 

CCCTGTACGC 
GCGQaMJCTT 



\ GAAATTAAAG GAGATCCOCT 



CACAGA6AAA 
AGACCGCAGG 
CCAGGGCTGC 
AGGCCGCAGA 
CCCTCTCGGT 
GTGCCCTGCC 
CTTTOCTCTC 
GCACTGCCTC 



CCIGAGTCTA 
CTCCCCCAGC 
AAAGCAAGGT 
ACATGCCAGG 
CAGGGTTGGC 
TTCTTAGGGG 
CCCTCTAGGC 



AOCCACCCTA 
GCACCCTCCT 
GOCATGAGGG 
CTCAQGTTGC 
CTTGCTTCTA 
TGTCCTCACT 
CAGGCCCAGT 
CATCAGCCCT 
ACA0QCX3CAC 
CCTCTCACAT 



GTCCCTTGAC 
GTCACTTATG 
CACATGCCCG 
CGTATGGAAA 
TACCACGGCX: 



CTGGGGTTTG 
CCAAACTCTA 



GCTCCrCCCA 



AGCACGTCCT 
GTCTCCACCA 
TAAAAGCIGC 
TGTATGOCTG 
AATAAATTCA 



21 



ATAAAATGGA GTCGGGGGGG CACATAGAAG 
TATCACCAAG ACACGCCTGC ATGTAAGACC 
CAAAGACTGT AGTATTCCAG ATGAGCTGCA 
GAAAACCATC GCCAACTCCT QCQATCAGCT 
TTACAXGGAC TTCIGTCCTT TAAAACQTTC 
GGATCCTTCC AABCACTCAT AGCCX3VGRIA 



51 



31 



HYPFYTDMNQ QUiRHG»«3SQ SCTU. 

Seq ID NO: 112 DNA sequence
Nucleic Acid Accession #: NM_o
Coding sequence: 58-2298



1020
1080 
1140 
1200 
1260 
1320 
1380

1500
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 

2160 
2220 
2280 



III 
MKDEPRSTNI) FMKLDSVFIW KEPFGLVLII APWNYPUJLT LVIiLVGTLPA GHCWLKPSB 
ISQGTEKVIiA BVIiEQYLDQS CFAWLGGPQ ETGQIiLEKKIi DYIFFTGSPR VGKIVMTAAT 
KHLTPVTLEIi GGKNPCTfVDD NCDPQTVANS VAWFCYFNAG QTCVAPDYVL CSPEMQERLIj 
PAIiQSTITRP YGODPOSSPN LGRIINQKQF QRIiRAIiLGCG RVAIGGQSHE SORYIAPTVIi 
VDVQBTEFVH QBEIFaPIU? IVHVQSVDBA ZKFINRQEKP LALYAFSKSR QWNQMLBRT 









TTTAGTTGCA 



GAATTCCGGG OGACGOGOGG GAACAACGCG AG TCGGO OOS OGGQAaaAaO 
CCAGTTT6IT GGOGGAAGCG 

ttcagacgao ctgatgaagt 
agaacqgaaa tcttaaacca 
acttctgtga gctcattgcg 
c!caac3u:»ao tcatcccatt 
tcttggtctc cc3ctacagca 
cx:ttaxatgg gaqatgaagt 
aattatgato ggaaagtaca 



GATGGTACIT ? 



ATCGTCAGAA 
TACAGOCTGT 
TGACCAGTGA 
CTTCAGTACC 
AAACTGTTTT 



AATTTTGGAA 
GCACATGCTG 
CTTGGATTTT 
CATAATGTAT 
ACATAACATT 



t TTATAAATGA TGAAATTTTT 0 



GAAAAATATA 
CCCAACATAQ 
CATACGCTTT 
ACACCCAACA 
CCACAOTGTT 



ATCACGGA6A 
AGGCCATTTC 
AAGAAOTCAC 
ATGGACX»AA 
TCTGTAGGCG 



TGATAAA6AA 
CTCAATGTTT 
CGAACAGCSia 
TGCTAAATCr 
ATGTTTTAAA 
GAAGAACACA 



GATGAAACTT 
CCAAATATTG 
GTCCTCATTG 
ACATGTAGAC 
GCTGAGGATG 
CACIGCAGAA 



GGACTGAAAC 
CGAGCTCCTC 
AACCTCCTGA 
GCACTTACTA 



Z CACCATTAAT 



GAATGTGGAG 
TGACAATTTC 
GTTTAGAGTC 
TCCAAGGAAA 
GAAAAAGGAC 



AGCCGCCCAC 
CCAGATAAGG 
CTCCCaGGCG 
GTTCAGAGAG 
TATGACTGCT 
QAAACAQCTC 



GT6CTGGAAT 
AACAATGATA 
TCTCGGTGTC 
TGC3AGTGGTG 
TGTGCCATTG 
AAAGAATCTA 



I 

- AATAATCATG 60 

TGTAAAATCA 120 

AAAGAGTATG 180 

AGAATGQAAA 240 

CGGOACTAGG 300 

AAAGACTCTG 360 

GAATTTTATG 420 

TTTAGATCAG 480 

OGGGGATAOA 540 



AAAGCnSAAA 
TCCrTCTGAT 
AGAACTAAAG 
TGAATGTACC 
ACACTCCTTT 
TTTTCATGC» 



CTCQGAAATT 
GCACAGCAGA 
CACTTCCTCC 
AGCAAAGCTT 
TCCTACATCC 
TAGACAACAA 
MCTCT 



CAAAGGATAC AGACAOTOAT 



QGCTCCTCTA 



AAACACCAAT AAAGATGAAG 
CTGAAGCCTC AATGTTTAGA 
CTAGGTTAAT TGGGACCAAA 
GCATCATAGC TCXavGCTCCC 
AACACCGGTT GTGGGCTGCA 
ACCAT6TTTA CAACTATCAA 



1080 
1140 
1200 
1260 
1320 
1380 

1500 
1560 
1620 







CCCTGTGATC ATCCACGGCA 
TTTTGTGAAA AGTTTTGTCA 
TGCAAAGCAC AGTGCAACAC 
CCTGACCTCT GTCTTACTTG 
AAGRACTGCA GTATTCAGCG 
GCAGGCTGGG SGATTTTTAT 
TGTGGAGAGA TTATTTCTCA 
ATGTGCAGCT TTCTGTTCAA 
AACAAAATTC GTTTTGCAAA 
GTTAAOOQTQ ATC3VCAGGAT 
CTQTTTQTTO ATTACAQATA 
GAAATGQAAA TCCCTTGACA 
CTTCAGOAAC CTOGAGTACT 
AATTT6CAAA aTACTGTAAG 
GCCTTCTCAC CAGCTGCARA 
TACATTTTTC AACTTTGAAT 



GGGCTCCAAA 
CAAAGATCCT 
AGATGAAGCT 
CTTGAACAAT 
TCATTCGQTA 
AGGTATTTTT 



AGTTCGIGCC 
GAGTGTCAAA 
COGTOCTACC 
GACCATTGGG 
AAGCATCTAT 
GTGCAGAAAA 



AATCCAAACT 



ACAGTAAAAA 
TGCTGGCACC 
ATGAATTCAT 
GGRAAGTGTA 
TGGATGCAAC 
GCTATQCAAA 



AGAjGTGTGAC 
tgtgtcctgc 
ATCTGAOSTG 
CTCAGAATAC 
TGATAAATAC 
CCGCAAGGGT 



GCCAAGAlGAG CCATCCAC3AC TGGCX3AAGAG 
GATGCCCTGA AOTATGTCGG CATCGAAAGA 



AATAATTTAT 
GTGTTTTQTA 
AAAGAATACr 



AGTAATGAOT 
OCAGTGAATT 
TGAACTTGAA 



ACATGCAGTT TGAAATTCTG 
TTAAAAATCA ACITTTTATT 
TTT6CAATAA TOCAOTATGa 
AAAAAAAAAA AAAAAA 



1680 
1740 
1800 
1860 



2220 
2280
2340 
2400 
2460 
2520 









11 



21 



41 



II.... 
MGQTGKKSEK GPVCWRKRVK SEYMRLRQLK RFRHADEVKS MFSSNRQKIIj ERTEIUJQEW 
KQRRIQPVHI LTSVSSIiRQT RECSVTSDLD FPTQVIPLKT IiNAVASVPIM YSWSPLQQNF 
MVEDETVLHN IPYMGDEVLD QDGTPIEBLI KNVDGKVHGD RECGPINDEI FVEbVNALGQ 
VNDDDODDDG OOPEEREBKQ KD[>BDRSODK ESRPPRKFPS OKILEAISSM FPDKGTAEBI. 
KEICYKELTEQ <]I.FGAI>PPEC TPMIDGPNAK SVQREQSLHS FHTLFCRRCF KYDCFUIFFH 
ATPNTYKRKN TSTALDNKPC GPQCXQHLEa AKEFAAAI.TA GRIKTPPKKP GGRSRGRIini 
KSSRPSTPTI HVLESKDTDS DREASTBToa OOiDKEEEEK KDBT8SSSBA NSRCQTPIXM 
KPNIEPPEHV EWSGAEASMP RVLIGTYYDH FCAIARLIQT BTCBOVYEFR VKESSIIAPA 
PAEDVDTPPR KKKRKHRLWA AHCaiKIQLKK DGSSNHVYIIY QPCDRPRQPC DSSCPCVIAQ 
NFCEKFCQCS SECQNRFFGC RCKAQCNTKQ CPCYLAVREC DPDIiCLTCGA ADHHDSKNVS 
CKNCSIQRGS KKHU.LAPSD VAGWGIFIKD PVQKNBPISE YCGBIISQDE ADRRGKVyDK 
YMCSPLFNLN NDFWDATRK GMKIRFANHS V 
ELFVDYRYSQ ADALKYVGIB REMEIP 

Seq ID NO: 114 DNA sequence
Nucleic Acid Accession #: NM_001827
Coding sequence: 96-335 



I 

AGTCTCCGGC 
CGCTCTOSTT 
CGGACAAGTA 
CCAAACAAGT 
AACAGAGTCT 



TTTTCAAATT 
ACAAATCTTT 
AAATGCAACT 
TTTCTCTTAA 
TATGTTOCAT 

I ID N 



11 
I 

QAGTTGTTGC 
TCATTTTCTO 
CTTCGACGAA 
ACCTAAAACT 
AGGCTGGGTT 
TCTTCCAAAA 
TAATGTATAT 
CaVTCCATACC 
QCAAGTAGGT 
GTGCCTGTTT 
TTAAAAAAAA 



CTGGGCTGGA 
CAGCGCGCCA 
CACTACX3AGT 
CATCTGATGT 
CATTACATGA 



GTGTATATAA 



31 
I 

CGTGGTTTTG 
CGAGGATGGC 
ACCGGCATGT 
CTGAAGAGQA 
TTCATGAQCC 
AATGAAGTTT 
GGTAGTATTC 
CTGTATTCTT 



2 CCGCTCTTCG 



GTGGAGGAGA 
AGAACCACAT 
ATCTGGGGAT 
AGTGAATACT 



OGTCAAATCT 
TGAGAAAT6T 
GAGCTCAGTT 



TCKCTTAAGAT AAAAO TTCTr CCAOT CAGTT 540 

GAGTTTACra AAACA6TTTA CrrTTGTTCA ATAAAGTTTG 600 



: 115 I 



Protein Accession #: NP_0



I I I 

MABXQIYYSO KYFOEKYEyR " 
BPBPHILLFR RPIiPKDQQK 



HVMLPREIiSK QVPKTHLMSE EEWRRLGVQQ SLGWVHYMIH 



TCAGACCTCA TGAGTCACTT GGACTCTTGA G 






CCTTOGTGCT 
TCCAQGGAAA 
GAGTTCAAAA 
CCACATTCCC 
ACCTGGCTCA 
TTTGCTGAGC 



ATOGAOSAAG 



GCAGCCAACA 
AAGGACTGAT 
CCTTGGAAGQ 
TC31AGAATTC 
AGTGTGGGAG 
C CCAT CTCTA 
CCTGTAGTTC 
AGGCIGCAAT 
CCTOTCTCAA ATAATAATAA 



TGCAGGCTTG 
ACCAGGAAGT 
GGCTGGACCA 
ATGGTGCCTC 
GCCCAGGAGT 



CCTOAAGGAG 
GACAGACCTT 
CTCTTCGCCC 

cxsKxn:cccA 



TCCTGCTCOG 



CAGAGGGATG 
GTCCTTCTTC 
TCCCTGTCCC 
TGTCTGCTGT 
GAGAGTGAAT 
TGTAAAGAGO 



TAATAATAAT 
GAAAGAAAAQ 
CATOGCTGGA 
CTTGTGGAAA 
AGGGAACCAA 
GCCTTTGTAC 
AGTTAACX3VC 



CTCTQCAQAO OTCAAGTQAA ASCGACXSGCC 
GAAAGTACAO GGOGCTCTGT G(»6GATGGG 
TTAGCAGAAC CCCCGCGTGC CAACTGGACC 
CCTCTTGAGA GGGAGGAGCT CTGGATTTGA 
ATGCCTATAA TACCAACACT TTGGGAGGCC 
TCAAGACTAG CCTGGGCAAC ACAGAGA6AA 
TAAAAAATTA GC AGGG CATB GTGGC ATGTG 
6CAAGAG6AT GGCIGGAGOC TGGGATQTTG 

CAcrccascc igggcaaaag agoqacuuua 

CTTATTTTGa AGAATAAASA GACCTCTGOA 



GGTGACCTAC AGTTGAAGAA GACTCATTAT 
GTGTTTCCTC TGCTGCTACT GCTCATQAGA 
AGGGCTTTCT ACX»C3^CCCr 
TCAGCAATTC TTGTTTGCTC CATTATCTTC 1020 
ACTTAGGTCA AATAGQATCT AAATTTTTGT 1080 
TQTCTTGCakA GGAGCCTTGG AATAGTAACT 1140 










CTTCrCA.TTT GTTTGGGKIC 
ABTTCATCAO aCTCTOGGAC 
6TGAAGGCTC GTGTTCTCCA 
rraGAAGGGC AAAAAATGAA 
ACAQTCTGCT QTGAAQACCT 
GCTTCATGAG AGRCTGACAG 
CAGTGTGK3C TGATGACACA 
ACTCTGTAGC CAACATACAC 
CTXGTCCAAA TGCAGAGTCA 



tqsccacx:aa 
cttagggctg 
.tcctcaactt 
cactgtcgtt 
tctctcaagt 
ctatcagggg 
tacacacxtg 
atgatttaaa 



QTTCCAGAAT 
TXGGAGAAGG 
TCTTTGCTTC 



V TAATTAAAGT 



GGCATTTGGG 
TTGTGGCACT 
ACAATAGCTT 
ACCCTTTCTA 
TACTTCATTA 
TTTTGTATGT 



GATACACX3GA 
CTTCAGCAGC 
GATCATACAC 
GTGTTTTGTG 
AGTCCATGCC 
TAGTQAGGAC 
GAGTCTTCTC 
AATATCTATC 



TGCAAAAAAA 



TCSJSTGCAGA 
AGAACTGATG 
AAGAATACAT 
ACACAGATGC 
AGATCATGGT 
TCTCCTCCCC 
TGTTCCTTTT 
ATGGTTCATC 
GGCGAATAGT 
AAIVAAAAAAA 






1560 
1620
1680 
1740 



Seq ID NO: 117 DNA sequence
Nucleic Acid Accession #: BC012178.1
204-2285 



31 



CTGGAGGAGA 
CTGGTQCTCA 
AAATTTTCCC 
TCATCTCTGG 
TATTCACTAT 
TATTTGGAGG 



I 

: GCGGCGCTGG 
CTCCTCTCCG 
CCGACCCTTC 
GGCCCTGGCC 
CCTTAAGGAT 



GGCCCGCGCT 



I I 
CCGCTGCTGT TGCTCCATTC G 
CTCCTCGACX: AGQCCTCCTT C 
CCGCCCCGTC TCGTACTGTC G 
TGTGCAACGG AGACTCCAftG C 



GTAOGGGAAA G 



GGCGCTTTTC 
CTCAACCTCA 
GCCGTCACCG 
croOAGAATG 

GGCCRCCACC ACTATGAAGG AGCTGTTGTC ATTCTGGATG 



ATGGAGATAG 
TAGTAGCAGG 
TTGaCCTTAC 



OAOTAGGCAC 
CAGCTTTQCT 
GCTTTATGAG 
AGGTCAAAGT 
CAGATGAAQA 
CTGAAGAGAA 
GASAAATGAA 
TAATTGAAAG 
ATOACAC31QA 
ATTTTCRTAA 



AGGACCTAAT 
TGGCAAQCCT 
TACTGTGCAC 
ATGTTCATTA 
TGTAGACAAA 
CATAGCAAAT 
AOAAAATGGA 
CTTCACCGTO 



GTTCTTGGAA 7 



AAAGTAATAC 



TTCftGAAGGA 
GATTCAAGOT 
ABTTATATG6 
TGAAGAATTT 
AACTTQAOTa 



AGAAGTTGTT 
TGTGGCACGT 
AGCACAGTTC 
CCITTATGAT 
XAIT0GASA6 



AAATGGTGCT 
AAAACXSAGAA 
GATAAATGCT 
TAGAACCCCA 
AAOAAAAATC 
CTTGAAACCA 

TGCATCCxrrr 



TTGAACCAA6 A 



GCTCATTCTT 
CGGAAAAGAA 
ATTGGGGATA 
GAGGAGGTTT 
GTTQCAAGTG 



CTTATATTTG 
CTGCAAGTGT 
AGGATCAGGA 
CAATTAAAAC 
CCAGTAAASA 
TGTGTCACAA 
CAGAT6TTAC 



AOATGAAGTG 
TCCATTTCCA 
TAAGGACTTT 
TAAAAAQCCA 



CCTGAAACCA ACAATATTTT 



TGTAGGTGTG 
TGAACCTGAC 
CGTTAACAGA 
TCCCACTTTC 
TAACATTCIC 
ACCATTACAT 
TATTCGAACC 



TATTTGGCCC 



AQGTACTTTA 
ACTCATCAAA 
AGTAATAQAA 
TGQACTTCCA 
AGTAATATOT 
GAAAATAOTA 
CAAAGCCT6C 
ACTGAATGCC 
CAGTTACGTG 
GGCTAGGCTT 



08T0CTATTA 420 

GATCCAGCAA 480 

ATQAATAAG6 540 

AACATTAGTG 600 

TTGCTTACAC 660 

TCTGGAAACA 720 

CACCCTGAAO 780 

ATAGCTGOAT 840 

ATCAAAGASA 900 



1320 
1380 
1440 
1500 
1560 



ATTGATAAIG 
CTTGQAATTC 
CTACCAATAT 
ACCACAAGTC 
GAAGTAATTG 
aSGCCTGATC 
ACXXATCACa 
CCTCTGAAAO 
GAAGAGTTAQ 



GCTGATTTTT 
ACAACAGAAG 
TTCTTGCTOC 
TGTGQAATCT 
ATACCTCGCA 
OAACCTCCTA 
CAAGCTGATT 
CASATGCCGG 

TGATTTTGAC ACCATTACAT TTTOATCGQa ACCCACTTCA AAAOCAOCCT TCATOCCAQA 

CCTSCAACAC 

CIGGCAATQA GATCCCTGTA GAGaTGOTAT TAAAOATOaT CftCIGAGATT AABAAGATTC 
CTGGTATTTC TCQAATTATG TATGACITAA CATCAAAGCC CCCAGGAACT ACTGAGTC3GG 
AGTAATAAAC TTCTTOTTCT ATTAAAA 



CAAATTACX3\ 
CAGGCXGACT 
TOGGAATCAC 
GTTGTTTATA 
TTOACAACAG 
AGOSAGTCTa GSTATaCTOa GAAAATCASC 
TTTOATCGQa ACCCACTTCA AAAGCAGCCT 



1920 
1980 
2040 
2100 
2160 
2220
2280 






APAIKEQGFR 
KSVREDGVFN 
SKKLYGAQFH 
VLLSGGVDST 



EVFItAQGTIiR 
ILGRELGIaPE 
TLLQRVKACT 

ESGYAGKISQ 



PDLIESASI.V 
ELVSRKFFPG 
TBBDQBXU4Q 



VYAEDAPWFD 
RGLQKBBWI. 
VIIiiaiFI<YDI 
NQEQVIAVHI 
KHISKTLNMT 
ASQKAEI.IKT 
F6LAIRVICA 
ITSLHSUaAF 



LDAGAQYGKV 
PAIFTIQKFV 
LTBGDSVDKV 
AGCSGTFTVQ 



LGICS<^MM 
ADGFKWARS 
NHKTiBCIRBI 



TSPBKKRKII 



GDTFVKIANE 
liSEEGKVIBP 
BTHNILKIVA 
GDCRSYSYVC 
TTGVIiSTLRQ 



51 
I 

QSEIFPLBTP 
NKVFGGTVHK 
GNIVAGIANB 
KERVGTSKVl. 
GIQVKVINAA 
VIGEMHI.ICPB 
LKDPHKDEVR 
DPSASVKKPH 
GISSKDEPDH 
ADFEAHNILR 



? DROPIKiKQPS CQRSWIRTF ITSDFHTGIP ATPGNEIPVE 660 



Seq ID NO: 119 DNA sequence

Nucleic Acid Accession #: NM_006500.1

Coding sequence: 27..1967



ASTCCCAAGQ CAACCTCAGC CATGTCSACT C 



r TCTGAACTGC GGCCMCTCCC 180 






TCATCTTCCG TGTGCGCCAG GGCCAGGGCX: AQAGCOAACC TGGGGRGTAC GAGC3«3CGGC 300 

TCAGCCTCCA GGACAGAC3GG GCTACTCTGG CCCTGACTCA AGTCACCCCC CAAGACGAGC 360 

«3CRTCTTCTT GTGCCAGGGC AAGCX3CCCTC GGTCCCAGGA GTACCGCATC CAGCTCCGCG 420 

TCTACAAAGC TCCGGAGGAG CCAAACATCC AGGTCAACCC CCTGGGCATC CCTQTGAACa 480 

GTAAGGAGCC TGAGGAGGTC GCTACCTGTG TAGGGAGGAA CGGGTACCCC ATTCCTCAAG 540 

TCATCTGOTA CAAGAATGGC CGGCCTCTQA AGGAGGAGAA GAACCGGGTC CACATTCAGT 600 

CGTCCCAGAC TGTGGAGTCX3 AGTGGTTTGT ACACCTTGCA GRGTATTCTG AAGGCACAQC 660 

TGGTTAAAGA AGACAAAGAT GCCCAGTTTT ACTGTQAGCT CAACTACOGG CIGCOCAGTG 720 

GGAACCACAT GAAGGAGTCC AGGGAAGTCA COSTCCCTGT TTTCTACCCS ACAGAAAAAG 780 

TGTGGCTGGA AGTGQAGCCC GTGGGAATGC TGAAGGAAGG GGACCGCGTG GAAATCAGGT 840 

GTTTGGCrOA TCGCAACCXTT CXACCACACT TCAGCATCAG CAAGCAGAAC CCCAGCACCA 900 

GGGAGGCAGA GGAAGAGACA ACCAACGACA ACGGGOTCCT GGTGCTGGAO CCTQCCCGOA 960 

AGGAACACAG TGGGCGCTAT GAATGTCAGG CCTGGAACTT GQACACCATG ATATCXKTGC 1020 

TGAGTGAACC ACAGGAACTA CTGGTQAACT ATGTGTCTGA OGTCCX3AGTG AGTCCCGCAG 1080 

CCX:CTGAGAG ACAGGAAGGC AGCAGCCTCA CCCTGACCIG TGAGGCAGAG AGTAGOCAfiG 1140 

ACCTCX5AGTT CCAGTCGCTQ AGAGARGAGA CAGACCAGGT GCTOGAAAGQ GGQCCTSTGC 1200 

TTCAGTTGCA TGACCTGAAA CX3GGAGGCAG GAaGGOQCTA TOGCTaOSTO aOBTCTGTQC 1260 

CCAGCATACX: CGGCCTGAAC CGCACACAGC TGGTCAAOCT GGCCATTTTT GGCCCCCCTT 1320 

GGATGGCATT CAAGGAGAGQ AAGGTGTGGG TGAAAGAGAA TATGGTGTTG AATCTQTCTT 1380 

GTGAAGCX3TC AGGGC3VCCXX: CGGCCCACCA TCTCCTGGAA CX3TCAACGGC AOGGCAAGTG 1440 

AACAAGACCA AGATCCACAG CGAGTCCTGA GCACCCTGAA TGTCCTCGTG ACCCCGGAGC 1500

TQTTGGAOAC AGGTQTrQAA TGCACGGCCT CCAACGACCT GGGCAAAAAC ACCAGCATCX: 1560 

TCTTCCTGGA GCTGGTCAAT TTAACC»CCC TCAC»CX»GA CTCCAACACA AOCACTQQCC 1620 

TCAGCACTTC CACTGCX»GT CCTCATACCav GAGCCAACAO CACCTCCACR OAOAGAAAGC 1680 

TGCCGGAGCC GGAGAGCCGG GGOGTGGTCA TCOTGaCTOT QArrGTOTQC ATCCTOOTCC 1740

TGGCGGTGCT GGGCGCTGTC CTCTATTTCC TCTATAAGRA GGGCAAGCTO CC3STGC3«3GC 1800 

GCTCAGGGAA GCAGGAGATC ACGClGCXrCC CGTCTCGTAA GACXS3AACTT GTAGrTGAAG 1860 

TTAAGTCAGA TAAGCTCCCA GAftGAGATQG GCCTCCTOCA GGGCAOCAGC GGTGACAAQA 1920 

GGGCTCOGGG AGACCAGGGA QAGAAATACA TCX3ATCTGAG GCATTAGCCC CGAATCACTT 1980 

CAGCTCCCTT CCCTGCCTGG ACCATTCCX» GCTCCCTGCT CACTCTTCTC TCAGCCAAAG 2040 

CCTCCAAAGG GACTAGAGAG AAGCCTCCTG CTCCCCTCAC CTGCACACCC CCTTTCAQAa 2100 

GGCCACTGGG TTAGGACCTG AGGRCCTCAC TTGGCCCTGC AAGCOGCTTT TCAGGGRCCA 2160 

GTCCACC3VCC ATCTCCTCC3V. OGTTGAOIGA AQCTCATCCC AAGCAAG6AG CCXXSUSTCTC 2220 

CCGAGCGGGT AGGAGAGTTT CITGCM3AAC GSCeTTTTTrC TnACACACA TTATGGCrGT 2280 

AAATACCTGG CTCCTGCCAa CAGCICAGCT GOGTAOCCTC TCIGAGCTGG TTTCCTGCCC 2340 

CAAAGGCTGG CTTCCACXaT CCAGGTQC3VC CRCTOAAOTO AGQACACACC GOAGCXaWSGC 2400 

GCCTGCTCAT GTTGAAGTGC GCTGTTCACA CCCGCTCCGG AGAGCACCCC AGCGGCATOC 2460 

AGAAGCAGCT GCAGTGTTGC TGCCACXaCC CTCCTGCTCG CCTCTTCauUV GTCTCCTGTG 2520 

ACATTITTTC TTTGGTCAGA AGCCAGGAAC TGGTGTCATT CCTTAAAAGA TAOQTGCCGG 2580

GGCCAGGTGT GGTGGCTCAC GCCTGTAATC CCAGCACTTT GGGAGGCXXW GGCBGGCGGA 2640 

TCACAAA<3TC AGQACQAGAC CATCCTCGCT AACAOSGTGA AACCCTGTCT CXACTAAAAA 2700 

TACAAAAAAA AATTAGCTAG aCGTAOTGOT TGGCACCTAT AGTCCCAGCT ACTGGGAAG6 2760 

CTGAAGCAGG AQAATGQTAT OAATCCAGQA GSIOGAGCTT GCAGIGAGCC GAGAC0GT6C 2820 

CACIGCACTC CAGCCTOGGC AACaVCAGCQA OACTCCOTCT COAGGAAAAA AAAAGAAAAG 2880 

ACGOGTACCT GCGGTQAGGA AGCIQGGOSC TGrrTTCGAG TTCAGGTGAA TTAGCCTCAA 2940 

TCCCCGTCTT CACTTGCTCC CATAGCCCTC TTGATGGATC AGGTAAAACT GAAAGGCAGC 3000 

GGGGAGCAGA CAAAGATGAG GTCTACACTG TCCTTCATQG GGATTAAAGC TATGGTTATA 3060 

TTAGCACCAA ACTTCTACAA ACCAAGCTCA GGGCCCCAAC CXTTAGAAGGG CCXSkAATGAG 3120 

AGAATGGTAC TTAGGGATGG AAAACGGGGC CTGGCTAGAG CTTCGGGTGT GTGTGTCTGT 3180 

CTOTGTGTAT GCATACATAT GTGTGTATAT ATGGTTTTGT CAGGTOTQTA AATTTGCAAA 3240 

TTGTTTCCTT TATATATCTA TOTATATATA TATATOAAAA XATATATATA TATGAAAAAT 3300 

AAAiGCTTAAT TCTCCCAGAA AATCATACAT TGCTTTTTTA TTCT ACaT GG GTACCACAGG 3360 

AACCrGGGGG CCTSIOAAAC TACAACCAAA AGGCACACAA AACCGTTTCX: AGTTGaCAGC 3420 

AGAGATCAGQ GGTTAtXrrCT GCTTCTGAQC AAATOGCTCA AGCTCTACCA GAGCAOACAG 3480 

CTACCCTACT TTTCAGCAGC AAAACGTCCC OTATGAOGCSV GCACGAAGGG CCTG6CAGGC 3540 
TGTTAGCavaG AGCTATGTCC CTTCCTATCG TTTCXSSTCC31 CTT 



Seq ID NO: 120 Protein sequence:
Protein Accession #: NP_006491.1



1 11 21 31 41 51 

MQURLVCAF LLAACCCCPR VAGVPGEAEQ PAPELVEVEV GSTAIiUCaGIi SOSOSOiSHV 60 

DHFSVHKEKR T1.IFRVRQGQ GQSEPGEYEQ HLSIi(»)RGAT tiAIiTaVTPQO ERIFLOQGKR 120

PRSQEYRIQIi RVYKAPEEPN IQVHPliGIPV NSKEPEGVAT CVGSNGYPIP QfVIWYKHCBlP 180 

LKEEKNRVHI QSSQTVBSSG LYTLQSHJUV QLVKEDKDAQ FYCBLNYRI* SGNH MKESRE 240 

VTVPVFYPTB KVWLEVEPVG MIJCEGDRVEI RdjADGHPPP HFSISKQNPS TREAEEETTK 300 

DNGVIiVLEPA RKEHSGHYEC QAWHLDTmS LLSEPQEIiLV NYVSDVRVSP AAPERQEGSS 360 

LTLTCBAESS QDLBPQWLRE ETDQVI.ER6P VLQLHDLKHE AGGGYRCVAS VPSIPGLNRT 420 

QLVKIjAIPGP PWMAPKERKV WVKENMVUn. SCEASGHPRP TISMNVNGTA SEQDQDPQRV 480 

LSTUIVLVTP EI.I.BXGVECT ASNDLGKNTS II.FLEI.VNLT TLTPDSNTTT 6LSTSTASPH 540 

TRANSTSTBR KLPBPESRGV VIVAVIVCIIi VIAVLQAVI.Y FLYKKGKLPC RRSGKQEITL 600 
PFSRKTELW EVKSDKLFEE MGLLQGSSGD KRAPGDQ6EK YIDLSB 

Seq ID NO: 121 DNA sequence
Nucleic Acid Accession #: NM_018305
Coding sequence: 60-671 

1 11 21 31 41 51

ATAGTCTACA CAGAGCTCCC CTTGCTGCCC A6ACAAOCTQ AAG6ACCACA GGAAAAOCCA 60 

TGGAGACTTC AGCATCCTCC TCCCAGCCTC AGGACAACAG TCAASTCCAC AGA6AAACA6 120 

AAGATGTAGA CTATGGAGAG ACAQATTTCC ACAAGCAACA CXnGAAGGCT G6ACTCTTTT 180 

OCXS^AGAACA ATATGAGAGA AACAAGTCIT CTTCCTCCTC CTTCICTTCC TCCTCATOCT 240 

CCTCATCTTC TTCATCCTCC TCCTCCTCWG GTCCTGGGCA TGGGQAGCCT GACGTTTTOA 300
AGGATGAGCT TCAACTCTRT
OACTCCGAAG GAGAQGCTCT 
GACTGAATAT AAAGAAAGAT 
GGGCCTTGCT GGTGTGTTAT 
TGCTCACCTT CGCXTTCCXTTG 
ACAGCX5TCCT CCAAGGCTTC 
AGACTGACTG AGGCCACTTC 
QCGACCCCTQ AGCCCAOVAG 
AGACCAATAA ACAGAACACT 
AGGGGCATCT CATTTGGGCA 



GGAGATGCTC 
GACCCAGCAA 
GATGAGTTTT 
CACTATTACQ 
GAAACCGTT6 
ATCCOCCTCT 



ACTGCrCTCA GAGGACAGCA 
ATGGTTTTTC TCAAATCCCA 
GGGAGAGATG GATGGTCCAC 
GAACCTGATO CAGGTAASAT 
GGCAQGAAAA TOATCATCAG 
TCCTCGCACT TTGG6AGGCT 
CTGGGCAACA TAGTQAGACC 
TACATACCTQ TACATACCTQ 



TTTCCTTCCA 
QTACTGCTGT 
CTTCACAGTA 



CTGGAGAGGT GGTACCCTCT GGGGAA-TCAG 360 

GTGGAGAAGT GGAGGCCTCT CAGTTAAGAA 420 

TCCATTTCGT CCTCCTGTGC TTTGCCATCG 480 

CAOACTGGTT CATGTCTCTT OGGGTCGGCC 540 

GCATCTACTT CJGGACTAOTG TAOSSTATCX: 600 

TCCAGAAGTT TAGGCTQACA GGGTTCAGGA 660 

GC3U»GGCAG GCCCCAGTGT GACCACCACT 720 

CATTCTGAGA QACX»:AChGG AQACCAAOCC 780

TGTGGTCTGA ATGTTGGCAC CSUSCCOGGGC 840 

GCAACCCAX3C TGCAAflGATG GAAGGCAGAO 900 

CCTGGACCAG CAGGAA6ATT CTGGGAGGTC 960 

AGCTCTGCAA GCTOTGATCT OTCTGGGTTC 1020 

ATGCGCTCTC AGGTGCTACC GAGCCATCCT 1080

CAGGGAGCCA TCGGGCTGGG CXXCCrVSGT 1140 
AAAACCATTT TTTTTGCACC C3U\AAAAAAA 






GGCTATCTGC 
TGCTTTGAGG 
GCTGAGGACT 
AAAGTAAATQ 

CAGGCTAAflG GTCBCTTQAA GCTaAaAGTT CAAOACCAAC 

cccavTcrcTA c3u«tttttt ttaatgacx:a AAiarGOOSG 



CGACAGAGCA AGATCGTTTC TCTAAAATT 



1260 
1320 
1380
1440 
1500 



SSSSSSSSSS GPGHGEPDVL KDELQLYGDA 
RLNIKKDDEF FHFVI.LCFAI GATiLVCYHYTf 
BSVIiQGFIPL FQKFRLTGFR XTD 



Seq ID NO: 123 DNA sequence
Nucleic Acid Accession #: BC022543
Coding sequence: 243..896



ADWPMSLGVQ LLTFASIiETV GlYFGLVYRI 




TTTTAATCAA 
TTAAACAGGA CATTCCTGCA 
AGAGAAACAT AACAGA6GCA 
GGAGTCTGAA 
CTTTTTGCCT 
TGTGGTCAAT 
CTGGGCTCAC 
QAACAAGATG 
GACTGTACAT 
ATTGATCCTT 
AATGCTTCCT 
ATTAATTACT 
TTGGQATCAT 
CCTCAAGCAT 
GAGAAGTGAC 
ATTATTCTCA 
TAAGAATAAA 
CQAGGTGGGC 
TACTAAAAAT 
OAAGGGTGAG 



GTTTTCAAGC 
AAGCCTCGAT 
TTTTGAAATQ 
TATGCCAATG 
CAGTGGGACT 
TGTGCTCTAC 
TATGTAGTTA 
TTCTTCTAGA 
TAQAQGAAAT 
CCTTGGTAAT 
TTTATTTTGT 
GTAGAGACyiA 
COSTTATAAA 



GTGCACTGCC 
AACCCAGATT 
TC3«3AAGTGG 
AAGTATAAAT 
ACCTCT CTAG 
GTAGCAGTTT 
AGAAACCTAA 



TCTCAGCTAA 
CAGATGCCAT 
TTTATCTTCA 
TTTTGCAAGT 
ATAGGCXAGG 



TTGAGGACTT 
TGOATCCGTA 
CAQAAAATTT 
ATGCCAGACG 
GCTATCATC6 
TGTTGATGTT 
CAGCCCCTTG 
CAGTATATAA 
TATGTTCTQT 
TCAAATATGG 
ATAAGATCTA 
TCATTT 



GCACAOTCaC CQTCTCTTAA 360 

TGAGTTGGCT TCATTAOGAG 420 

TGATATAGAG GCCX:CTAACT 480 

AGATTCACAG TGCATTGACT 540 

GAAGATGGAG 600 

GAGTTCCCGA 660 

AATGAGGATA 720 

CTACAAGTTC 780 

ATTACAATCC 840 

CTATAAGTTT 900 
ACGAGAGGT6 



TTGTGACCAA 
TGCTTTGGAT 
GAATGTGATT 
QACTCTQCTC 
CCATTTTTCC 
TTAATTTCTG 



rrtAC 



ACAAAAGrXA 



TTCCAAAATG TAGTGCTCTA TIGCATGGAT 
AAGGGGAAAC TTAATTCT6C T AAAT TAATO 
TTTGQGGTAG AAAAATTATT TCTTTATGTA 
ACTTTCAATT TAAGCTACAA ArrOAGAAAA 
CACAOTOGCT CACACCTGTA ATCCCAGCAC 
A6QTCAAGAG TTTGAGACCA GCTTGGTaAA 
GCTGGGGCTG GTGGTGGGCA TCFGTAGTCC 
CGCTTGAACC TGGGAGGOGG AGGTTCCAQA 



GGAAAAACAA AAAAOAAOAA TAAAATAATT TOSAXGAAAA TCATSTTTAT 
ATGTCATGAQ ACTATTAAAG ATGTOCCAGA GTTTCAA1GA AAATCATTAA 
CTAAGAAATT AATATTAATA TAAAAATTAT TGATAATCTT AAATTATTGA 1 
ACGCACTCCA TTCTCCTTTT ACRTTTTATC ATGTTTCTTT TGAATATATO 
GOACnGATG AAACTGAGTA CTAAQATTta OTACAGAGTA TOTCAGGAAG 
TTGCCATTTT AAATAAAGTT OTACATGAAC AAAAAAAAAA AAAAAA 



1440 
1500 
1560 

TTAAATA6TA 1620 
1680 
1740 
1800 
1860 






21 



41 



I t J I I I 

MCSEIILRQE VLKEGFHEDL LIKVKFGESI EDLHTCRIil KODIPAGLYV DPYEIiASUlB 
RNITEAVMVS ENPDIEAPNY LSKESEVLIY ARHDSQCIDC FQAPLPVHCR VRBPHSED6E 
ASIWNNPDL LMPCDQAGSR RMIRFRFDSP DKTIEFPILK CWAHSEWAAP CAbEHEDICQ 
WHRMKVKSVy RHVILQVFVG LTVHTSLVCS VTI4.ITILCS KKKKK 

Seq ID NO: 125 DNA sequence

Nucleic Acid Accession #: NM_004994.1

Coding sequence: 20..2143
1 11
I I 

AGACACCTCT GCCCTCACCA 
GGGCTGCTGC TTTGCTGCCC 
CCTGAGAACC AATCTCACOG 
CACTCGGGTG GCAGAQATGC 
CCAGAA6CAA CTCTCCCTGC 
GCGAACCCCA CGGTGCQGGG 
CAAGTGGCAC CACCACAACA 
GGCXK3TGATT GACGACGCCT 

CAccrrcACT cgcgtgtaca 

GCACX3GAGAC GGGTATCCCT 

tggcccxx3gc attcagggag 
gggcgtcgtg qttccaactc 
catcttcgag ggccgctcct 
ctgoigcagt accaoggcca 
gafsactctac accxsggacq 

CCAAQGCCAA TCCTACTCOQ 
CGCCACCACC OCCAACTACQ 
CTCGACGGTG ATGGGGGGCA 
GGGTAAGGAG tactcgacct 
TACC3VCCTCS3 AACTTTGACA 
TTTGTTCCTC GTGGCGGGGC 
GCCGGAGGCG CTCATGTACC 
CGACGTGAAT GGCA-TCCQQC 
AACCACCACC ACACOGCAGC 
TQTCCACCCC TCAQAOCGCC 
AGGTCCCCCC ACTGCTGGCC 
TGCCTGCAAC GTGAACATCT 
CAAGGATGGG AAQTACTGGC 
CCTTATCGCC 6ACAAGTGGC 
GCTCTCCAAG AAGCTTTTCT 






TGAQCCTCTG 
CXAGACAGCG 
ACAGGCAGCT 



GCAGCCCCTG 
CCAGTCCACC 
GGCAGAOGAA 



TOCCAGMXrr 
TCACCTATTG 
TTGCCCGCGC 
GCCGGGACGC 
TCGACGGGAA 
ACGCCCATTT 
GOTTTGGAAA 
ACrCTGCCTG 
ACTACGACAC 



T6AGCTG6AT 
GGGCAGATTC 
GATCCAAAAC 
CTTCGCACTG 
AGACATCGTC 
GGACGGGCTC 
CGACGATGAC 



GTCCTGGTGC TCCTGGTGCT 
CTTGTGCTCT TCOCTGGAGA 
TACCrGTACC GCTATGGTTA 
GGGCCTGCGC TQCTGCTTCT 



CCTGCACCAC 
ACCGGGACAA 
ACTCGGCGGG 



GCGACAAGAA 
ATGAGTTOGG 
CTATQTACCG 
ACCTCTATGG 
CCAOSGCTCC 



CAAACCTTTQ AGGGCQACCT 
TACTCGGAAG ACTTGCCQCG 
TGGAGCGCGG TGACGCCGCT 
atccagtttg gtgtcgcgga 
CTGGCACACG CCTTTCCTCC 

gagttgtggt ccctgggcaa 
gcggcctgcc acttcccctt 

GQTOGCTCOG ACGGCTT6CC 



tggqaaaccc tqccagtttc cattcatctt 
ggacggtcgc tccgacggct accgctggtg 

GCTCTTGGGC TTCTGCCCGA CCOGAGCTGA 
GGAGCTQTGC GTCTTCKCCT TCACTTTCCT 
GGGCCGOGGA GATGGGCGCC TCTGGTGCGC 
QTQGGGCTTC TGCCOGGACC AAGGATACAG 
CCAOSCGCro GGCTTAGATC ATTCCTCAGT 
CTTCACIGAO GGGCCCCCCT T6CAXAAG6A 
TCCTCGCCCT GAACCTGAGC CAOGGCCTCC 



CTTCTACGGC 
TCOACGCCai 
QATTCTCIOA 
CCG03CTOCC 
TCTTCICT6G 



CCCCACAG6T OCOCCCTCAG CTGOCCCCAC 
CACTACIOTO CCTTTQAGTC CGGTGGACGA 
OGGGGAOATT GGGAACCAGC TGTATTTGTT 
GGGCAGGGGO AGCOG6CCGC AGGGCCCCTT 
CCGGAAQCTQ GACTCGQTCT TTGAGQAGCC 
GCGGCAGGTO TGG6TGTACA CAGGCGC6TC 



lAQTQGCA GGQGGMtffla GCIGCIGTTC ASOGGGCGGC GCCTCTGQAO 
GTTCQACaTG AAGGOQCAGA TGOrGGATCC CCG8AG06CC AGaJAUSTQG ACOGQATBTT 
CCCCGGGGTG CCTTTGQACA CGCAC6AOGT CTTCCAGTAC CGAGAGAAAG CCTATTTCTG 
CCAGGACOGC TTCTACTGGC GCGTGAGTTC CaSGAGrGAa TTGAACX»GG TGGACCAAGT 
GGGCTACX5TQ ACCTATGACA TCCTGCAGTG CCCTGAGGAC TAGGGCTCCC GTCCTGCTTT 
GCAGTGCCAT GTAAATCCCC ACTGGGACCA ACCCTGGGQA AGGAGCXSKST TTOCCGGATA 
CAAACTGGTA TTCTGTTCTG GAGGAAAGGG AGGAGTGGAG GTGGGCTGGG CCCTCTCTTC 
TC»CCTTTGT TTTTT6TTGG AGTOTTTCTA ATAAACTTGQ ATTCTCTAAC CTTT 



1020 
1080 
1140 
1200 
1260
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860
1920 
1980 
2040 
2100 
2160
2220 
2280 






I I I I I.I. 

MSUWQPLWLV LLVLGCCFAA PRQRQSTIiVL FPGOLRTNLT DRQLAEBV1.Y RYGYTRVAEM 
RQBSKSLGPA L1J:;.LQKQLSJ 
ITYWZQNYSB DLPRAVIDDi 



CTSEGRGDGR LWCATTSNFD SDKKWQFCFD QGYSIiFIiVAA HEFGHALGLD HSSVPEALMY 
PMYRFTBOPP LHKDDVNQIR HLYGPRPEPB PRPPTTTTPQ PTAPPTVCPT GFPTVHP8ER 
PTAGPTGPPS AGPTGPPTAG PSTATTVPLS PVDDACMVNI FDAIAEIOIQ LYIiFRDOXXW 
RFSEGRGSRF QGFFLIADKW FAI.PRKI.DSV FBEPLSKKLF FFSGRQVHVY TGASVLGPRB 
LOKLGLGADV AQVTGALRSG RQKMIiIiFSGR RLNRFDVKAQ MVDPRSASEV DRMFP6VPU) 
THDVFQYREK AYPCQDRFYW RVSSRSEUIQ VDQVGYVTYD ILQCPED 

Seq ID NO: 127 DNA sequence
Nucleic Acid Accession #: NM_004181
1 32-670 



GGTCGCCX3GC 



I I 
CCTAGGGAGA TCAACCCCGA 
CAGTGGCGCT TCGTGGACGT 
CCTGCCTGCG CGCTGCTGCT 



TTTTATTCTG 
TCOGGTGAAC 
CAGAGAATTC 
GGCAGCCTAA 
AATATATACC 
TGTTCTGCAG 
ACAGCTGTCC 
TATGTCTTGT 
AAGACCTTGG 



AAACTGGGAT 
TCCCCTGAAG 
GCCGTGGCAC 
TTTAACAACG 
CATGGCQCCA 
ACCGAQCGTQ 
TGCTCTGTGG 
CCCCATGCAG 
ACACGCCTTC 



ATTCCTGTGG 
TTGAGGATGG 
ACAGAGCAAA 
AOQAAGGCCa 
TGGATGGCCA 
GTTCAGAGGA 
AGCAAGGAGA 
6AGGGACTTT 
TCTAAAATGC 
CCCrCAGCCA 



GATGCTGAAC 
GCTGGGGCTG 

GCTorrTCCc 

GTTAGTCCTA 
CTTATTCACG 
AAACAQTTTC 
AAGAATGAG6 
GA7GA€»AGO 
CTTGATSQAC 
AAQGAaSCTG 
TCTGCOGTGG 
CCTCTTCCCT 
GTGAAACACA 
CTTAAGCACA 
GCTTCRGATG GTGAAGCATT 
AATGGCTACT TTGGTTTCTG 
AQAATAAATT TTGCTGATAG 



CACAATCGGA 
ATCA6TTCTG 
ATGCTTTGAA 
AXGTOGGGTA 
CCTCTATGAA 
CACCCTGCTO 
AGTCCXJCTTC 
GCTCATTTCC 
TTCAGTACTT 



CTCTGGGCTC 
AGCATQAQAA 
AAGTGTACTT 
CAGTGGCXJUi 
TTTCTGAAAC 
CCATACftGGC 
TGAAXTTCCA 
GAATGCCTTT 
CCAAGGTGTG 
CTCTCTGCAA 
TCAACATGAA 
GCTGTTCTTC 
AGC3WJAGTGC 
CrCCCCAGTG 
TCTGTAAGTT
21



31 



41 



I I I I I 

MUnCVLSRIiG VAGQWRFVDV LGLEEESl^S VPAPACALLL LPPLTAQHEN FRKKQIEELK 
GQEVSPKVYF MKQTIGNSCX3 TIGLIHAVftN NQDKLGPEDG SVLKQFIjSET EKMSPEDRAK 
CFEKNEAIQA AKDAVAQEGQ CRVDOKVHFH FILFNNVDGH LYELDGRMFF PVHHGASSED 
TLLKDAAKVC RBFTBREQOB VRPSAVAIiCK AA 






Seq ID NO: 129 DNA sequence
Nucleic Acid Accession #: NM_C
Coding sequence: 127-5385






AC6GAGTGTG 
CX3GCX3CTGCA 
GTCATGGAGA 
AGCCAGATGT 
6AGCTGGAGG 
TCCAACTCCA 



TCTCTGGQAC 
TCCGTGTGGA 
AC3\CCCA<3GC 
GCAGCTTCCA 



TGTTTGAGCC 



AAACTGCAGQ 
ATCCTGCAQA 
CTGGTCTTCT 
GGC3VTCATGA 
TACAGOACAC 
ATCATCCCCA 



: AGCTCACCAG 
: AGACOGACAT 
r CCTTCAAOAA 



CTTGGCAAAC 
TAAGGACTGC 
GGAGCTGCTG 
AATCACAGAG 
CCTGCOGGTC 
ACTGGAGRGC 
TCTGOACAAC 
CSACTACACT 
GAGGCCT6AG 



CGAGGCCTTC 
CACATCCGGC 
GATGGGAC6C 



AGQACTAOCC 
TCTTTGCTGT 
TCTCCTCACT 
CCTTCAATOG 
6GACA6AGGT 



CTCAGGCAAC 
CACGAGGGAC 
AGCCTTCCAC 
TGAACGGTGC 
GTCGGTGCCC 
CACCAACTAC 



GAGACCCRQA 
CX3TCTGCGGC 
CCCGTGGACC 
CTCRA GAflOA 
ATTGGATTTG 
AAGCTSAAiGG 
CTGACAOAAO 



TQCTCCTGGC AGCCTTGATC 
AGGCCCCAGT GAAGAGCTGC 
CAGACGAGAT GTTCAGGGAC 
GCCAGOGGGA GAGCATCGTG 
TTGACACCAC CCTGCGGCGC 
COQGTGAGGA GCGGCATTTT 
TGTACATCCT CATGGACTTC 
TOGGGCAGAA CCTGGCTGG6 
GCAAGTTTOT 6QACAAAGTC 



CTGCAAAAAG 
TGT6TGTGCA 
GACATTCAGC 



ACGTGTGCCA 
ACGGCCTCAA 
AGGTQCGGTC 



G&TCCGCTCC 
CACCTCCAAG 
GGGTATATAC 
GCTGCCGGAQ 
GATGGACGC6 
AGCTCGCTGC 



ATTGGCTGGC 
TATOAGGCTQ 
CAGCTGGACA 
ACXXTTGOIGC 
TCCTATA6CT 
CAGGAGOACT 
AAfiCIGOACA 
ATQTTCCAGA 



AATGCCACCT 
GGCCGCTGCC 
TOGGCGATCC 



GACGAGCTTA 
QACTGCACCT 
CTGGTGCACA 
CTCCTCCTCC 
TGCAAGGCCT 
GAAQAOCACr 



TCCGGTGTAC 
CAAGACCACA 
CTGAAGCTTA 
GGCTACTACA 
GTGGAGCTGG 
CAGCTGCTGG 
GTAAACAXCA 
TTCTCGGTCA 



CCTGCCTGCG 
ACTGTGTGTG 
AGTGTCCCCG 
GTGTGTGTGR 
GCATCGACAG 
ACTGCCACCA 
ACCCGGGCCT 
AGAAGAAGGG 



AGAAGAAGGA 

TGCcGcrccr 

GCCTGGCACT 
ACATGCTGCG 



ACAAGCTCCA 
CCATTGTGGA 
CAGAGAAGCA 
CCCTCACTGC 
TGOACGTACG 



GGAGGGCGAG 
CTACGGCGAA 
CACTTCCGGG 
GCCTGGTTGG 
CAATOGGGGC 



QAOCAGAAGG 
GGCATCATCT 
AGCrrCAACG 
ACCTGCAACT 
GACAAGCCGT 
GGCCGCTACG 
TTCCTCTGCA 



GCAACATCCA 
GTGATGTGTG 

GRGAcrrosT 

GCTCCACCXK3 



AGGGTCAQTT 



CTGCGAGGAC 
GOGCACGTGT 
GGAGGTGGTG 



GGCCCTGCTA 
TCTCCGGTGC 
lACCTG 



CACAGTGCTG 
GQTGGAACAG 
AGACCAGGAC 
GQTGCCXXrrC 



TACa«XX3AC». 
CTACX3CTCCT 
GAGGAATGCA 
GTGOGCTGCT 
GACGGCGCCC 
GGCTCCTTCT 
CTGCTGCTAT 
TGCAACCGAQ 
ATGGCCTCTQ 



AOOGAGAACC 
QAGAACCTGA 
TTCaMCAGC 
ATGGCGCCCC 
AGGGCCTTCX: 



GCTGTGACTG 
GACGT6GGCA 
CCATCTGCOA 



GTTCCGGAAT 
CTTCX3ATGCC 
CACCCACCTG 
CGTGCTGGCT 
CTACACCCAG 
C»AGCACAAC 
QCTTCACACC 
C6TGGAGCTG 
AGACAGCCCC 
TGGGTCCTTT 
TGA6CACGTG 
TCTGAAACCT 
CACCTGCGAG 
GTGCGGACRG 
CTCTCTGAGT 
TGGGGAGTGC 
CTGCGAGTAT 
AOJCrGCTCC 
TCCC CTCAGC 
CTGTGAGTGT 
GATCAACTAC 



CIGQGCCCSiA 
GGTGGCTCAT 
GCTGGAAGTA 
GTCACATGGT 
ACCACTTGGA 
GGAAGGTCAC 
CCACAGAGCT 



CAGCACTGTC 

CTGTGCCTGC 
GGGCTTTAAG 
CACGOCCaTG 
CAACAACATG 



ACGAGGTCTA 
AGCCCAATGC 
GCTCGGCCAA 
ACX3ACCTCAA 
TGGTGGAGTT 
CTGAGQAXGA 



TQACACTC6G 
CAGGCA6ATC 
CGGGAAAAAG 
GCOGGCCCTG 
GGTGGCCCCC 



TOOAGOCCAT OOAOGTOCCC GCAGGCSWrPQ OCRCCCTCGG CCXSCOGCCTG 



tacaiccccg 
qtqaaqctcc 
ttccacgtcc 
accatcatca 

TCACAOCCAC 



tggagggtga 

TGGAaCTGCA 
AGCTCAGCAA 



CCAOGTOGCC 
CTACOGCACA 
GCTGCTOTTC 
AGAAGTTGAC 
CCCTAAGTTT 
AGATGAACTQ 



0GCATCCCT6 
CAGGATGGCA 
CAGCXrrGGGG 
TCXXrrCCTGC 
GGGGCCCACC 
QACCGGAGCT 
QCCCCGCAGA 
CCCCCTTCTQ 



3 TGTCCTTI 



AGGCCTGGAA AGAGCTGCaG 
GGGGCCGCCA GGTCCGCCGT 
TGGGCCAGCC CCACTCCACC 
TCRCGAGTCA GATGTTGTCa 
ACCCCAATGC TAAGGCCGCT 



CCCTCAGTaG AQCTCACCAA CCTQIAOCCG TATTGOGACT 



AGCTGGGCTG AGCCGGCTGA 
CtGGTCaAOG ATGACAACCG 
AAGAACOGGA TGCTGCTTAT 
AAGGCXSCXSCA ACGGGGCCGG 
CAGOCCAAQA GGCCCKTGTC 
AQOSGGGAQG ACTAOGACAG 



ACCCTACAGC TCCCTGGTGT 
AATGTGGTCT 
GACCAACGGT GAGATCACAG 
ACCTATTGGG CCCATGAAGA 
TGAOAACCTT CGGGAGTCCC 
CTGGGGGCCT QAGCGGGAGG 
CATCOCCaVIC AXCCCTGACA 
TATG TACAGOGAtO 



TATGGCACCC ACCTOAGCCC 



ciccGAxanc actgagcacc 

CAACrCOCIG CACAlGSATaA 
ACA0GT6CCC GACCGCGTGC 



RTGAQATGAA 
CCTGCCGCAC 
CCTCCACGGT 
CCTACGAGGT 
AAGTCCTGGT 
AGCCCTACOG 
CCATCATCAA 
TOCCTATOGT 
AOOTTCTACG CTCTCCATGG 



GACCa«3CTG 
CTGCTATGGC 
TQACAACCCT 
CTACACGGTG 
CCTGGCC!ACC 



TAAGCACATC CTCCACCCTC 



1200 
1260
1500
1560
ACACGGOACT ACAACTCACT
GACTACTCCA CCCTCACCTC 



CACGAGCC6C 
GOCGGTGAGC 
GACCTCXTTGC 
GGCCGAGAGC 



GTGAGGGTGT 
CAGGCTCCGC 
TGAGCCCAGA 
TCGGCTACCT 



OACCOOCTCA QAACACTCAC ACTOG&CCAC ACTQCOGAGG 



GGAAGGCTGG 
GAGCCCACTG 
CCaSCTGGTG 
OAGGCCCAAT 
GCCAGCCACC 



AGCX3CCACCX3 
GGCGGCrCCX: 
GGAACCCTTA 



TCAiCCATAGA 
TCCAGCACCC 
AGCXTCTTCCT 
TCACCCGGCA 
GCACCCACAT 
GTCCCACTAG 



GTTCTCTGCC 
GCCGCTGCAQ 
CAACATCCCC 
CTACOTGTTC 
CaTCACCATT 
CTTCACTTTG 
CTCGCTQCAG 
GGTGACCTGT 



GTCOCaVGGAT 
GCTGCAAAGC 
AGTGGATGGG 
TGTGACCCAG 
GGACCAACAG 
GOGTCCTCCC 
GCCCAGCCCA 



CA.TCTCTCA6 
T6GAGTAOCA 
AGACXTTCGGT 
CCCAGAGCCA 
GAATCCCAGG TGCACCCGCA 
AGCACTCCCA GTGCCCCAGG 
CTGAGCTGGQ AGCGGCCACG 
GAGATGGCCC AAOGAGGAGO 
AGCCGQCTOA 
AOGACCACra 
GGAGGACCCT 
GAGTACStfSCA 
CCX3ACCCTGG 
GAGXTTGTGA 
TTCTTCCAAA 
GACTCCTCTC 



GCATCACCAC 
GGGCCCAGCA 
GCCGGACACT 
CTTGACCGCA 



CCA7CCTIGC ACCCCTGGGG 

TCCTGGGAGG CATGAA66GG OCAAOGTCOQ TCCTCTGIGQ Q 
AAAOAGCTCG QAGCAGCACA AGGAOCCAGC CTTTGTTCTG C 



GCCAGftGCGC 
GGGCAGCCGT 
CACCCACACC 
CXTTGOAGGCA 
GACCACCAGC 
CCCTGCCCCA 5400
CTCAGCTACT 5460
GCTAGSTGTC 5520 
ATTTQTAACC 5580 
\ ATGGTTTTGC 5640 



4440 
4500 
4560 
4620 
4680
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340



21 



31 



41 



51 



EVFEPLESFV 
PQTDMRPEKL 
QTAVCTRDIG 
TQDYPSVPTI. 
EAFHRIRSNIi 
raVCXJIiPEDQ 
CSBQWSGQTC 
FQCPRTSGPI. 
CHCHQQSLYT 
LKKAEEWVR 
IiLPLLALUiL 



I I I I 

SLSGTUUiRC KKAPVX8CTE CVSVDKDCAY CTDEMFRDRR 
BSSFQITBET QIDTTIiRRSQ MSPQGLRVRL RPGEBHHFEL 
SMSDDLDMLK KMQQNLAKVI. SQLTSDYTIG FOKFVSKVSV 
FSFKHVISLT EDVDBFRNKL QGERIS6NLD APEGGFOAIIi 
FSTESAFHYE ADGRNVIAGl MSRKDBRCHI. DTTGTYTQYR 
PIPAVTHYSY SWEKIBTYF PVSSLGVLQE DSSNXVEU^ 
LRTEVTSKMF OKTRTOSFHI RRGEVGIYQV QUUUiGHVDG 
SDGLKMDAQI ICDVCTCELQ KBVRSARCSF HGDFVCGQCV 
QICVCYGBQR YBGQFCBVDN 




LCWKYCACCK 



AQGEGPYSSI. 
NDONRPIGPM 
KBPHSIPIIP 



FHDIdCVAPOY 
TATUSRSIiVN 
GTAQGNSDYI 
HLGQPHSTTI 
SGKHUGYRVK 
VSCETHQEVP 
KXVLVDNFKH 
DIPIVDAQSQ 
MTTTSAAAYQ 



vhklqqtkfr qqpmaokkqd 
ytltadqdar 01vbfqesvb 
itiikeqahd wsfeqpeps 
pvggeij:.fqp qeahkelqvk 
iiropdblds sftsqmlssq 
yhiqgdsese ahlu3skvps 

SEiPGRI<AFlIV VSSTVTQLSW 
mUXIENLElE SQPYRYTVKA 



LLELQEVDSI. 
PPPHGDLGAP 
VELTNLYPYC 
AEPAETOGEI 



QHPHAKAAGS 
DVEMKVCAYG 
TAYEVCXGLV 
EAIIHLATQP 



EI.HSI.NIEHP AQTSVWEOI. 
UGSAFTLST PSAPGPLVFT 
RVDGDSPBSR LTVFGLSEHV 
I^JHPLQSEY SSITrrHTSA. 
I,STHMDQQFF QT 

Seq ID NO: 131 DNA sequence
Nucleic Acid Accession #: BC004372
Coding sequence: 132..2231



PTRLVFSAIiG PTSUIVSHQE PSCBRPI«6X SVEYQLUlGa 
ItPHRSYVTRV RAQSQBQWGR BRBSVITISS QVHPQSPIiCP 
AIiSPOSIKlLS HEEFRRPNGO IVGYIiVTCEM AQGQOPATAF 
pyXFKVQART TEGFGPEREG IITIBSQDGQ PFPQLGSRAG 
TBPFI.VD6PT LQAQBLBAOa SLTKHVTQEF VSRTLTTSGT 



CCTCGTGCCX5 
GCCGGCCCCT 
CTCOGGACRC 
TOBGCCTGGC 
AQAAAAATGO 
ATAGCACXTTT 
GCAG6TATGG 
OTQCAQCAAA 



CATGGACAAG 



TOGCTACAGC 
GCCCACa^ATG 
GTTCATAGAA 



ATGCCTTTOA 



ACGTGAGCAG 
CCTTTTCTAC 
ACBGAATCCC 
AAGAAAATGA 
ATGAAGATTT 
AGAACCAGGA 



ACCCAGAAGC 
ATTCTACAAG 
AGGAACAGTQ 



TGCTTCAGCT 
TGGACCAATT 
ATACAGAACa 
CGGCTCCTCC 
TGTACACCCC 
TGCTACCAGT 
AGATGAAAGA 
TATCTCCaGC 
CTGGAOXaU; 
GATGACTGAT 
ACACCCTCCC 
CACAATCX»G 
GTTTGGCaAC 



CCTCTGCCAG 
CAGGG ATCCT 
TTTTOGTGGC 
TTGAATATAA 
ATCTCTOGGA 
GCCCAGATGa 
GGGCATGTGG 
GTCTACATCC 
CCACCIGAAQ 
ACCATAACTA 
AATCCTQAAQ 
A6TGAAAG6A 



GTTCGGTCCG 
CCAGCTCCTT 
AaSCAGCCTG 
CCTGCOGCTT 



CCATCCTCGT C 



ACGTCTTCAA 
GACAQACACC 
ACCATTTCAA 
TGGAACCCAA 
GTAGACAGAA 
CTCATTCACX: 
GCAACTCCTA 



AOAAAGCTCT 
TGATTCXCCG 
TCACATOCAA 
AAGATTCTAC 
TTGTTAACCQ 
ACATCTACCC 
GCAGCACTTC 
AAGACAGTCC 
ATACCATCTC 
TCAQTTTTTC 



GGGACTCTGC 
TGCAGGTGTA 
TGACCTCTGC 
QAGCATC3GGA 
GATCCACXXC 
CACCTCCCAG 
ATCAGTCACA 
TGATGGCACC 
CAGCAACCCT 
AGQAGGTTAC 
CTGGATCACC 
AGCAGGCTGG 



1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 



CCrCCGTTOJ 
CTCX3TGCCGC 
TTCCACGTGG 

AAGG crrrcR 

TTTGAGACCT 
AACTCCATCT 
TATGACACAT 
GACCIOCOCA 
CGCTATGTCC 
ACTGATGATG 
ATCTTTTACA 
GACAGCACAG 
GAGCCAAATG 
ATTGATGATG 



CCACACCAOa GGCrmGAC CACACAAAAC 960 

GCCAITCAAA TCCXSGAAGTG CTACTTCMU 1020 

ATGOCACCAC TQCTTAXGAA GOAAACTGOA 1080 

ATOAGCATCA TGAG0AA6AA GAOACOCCAC 1140 

OTAQTACAAC GGAAOAAACA GCTACCCAOA 1200 

1U3GGATATC6 CCAAACACCC ASAGAAGACT 1260
CCCATTCX3AC AACAGGGACA GCTGCAGCCT CAGCTCATAC CAGCCATCCA ATGCAAGC3AA 1320

GGACAACACC AAGCCCW5AG GACAGTTCCT GGACTGATTT CTTCAACCCA ATCTCACACC 1380 

CXaTGGGAOG AGGTCATCAA GCAGGAAGAA GGATGGATAT GGACTCCAGT CATAGTACaA 1440 

CGCTTCAGCC TACTGCAAAT CCAAACACAS GTTTGGrrGOA A£3ATTTGGAC AGGACAGGAC 1500 

CTCTTTCAAT GACAACGCAO C3U3AGTAATT CTCAGAGCTT CTCTACATCA CATGAAGGCT 1560 

TGGAAGAAGA TAAAGACCAT CCAACAACTT CIACTCTGAC ATC3VAGCRAT AGGAATGATG 1620 

TCACAGGTGG AAOAAGAGAC CCAAATCATT CTGAAGGCTC AACTACTTTA CTGGAAGGTT 1680 

ATACCTCrCA TTACCCACAC ACGAAGGAAA GC3«3QACCTT CATCCCAOTG ACCTCAGCTA 1740 

AOACTGGGTC CTTTGGAGTT ACTGCAGTTA CTGTTGGAGA TTCCAACTCT AATGTCAATC 1800 

OTTCCTTATC AGGAGACCAA GACACATTCC ACCCXaGTGG GOSGTCCCAT ACCACTCATG IBSO 

GATCTQAATC AGATGGACAC TCACATGGGA GTCAAGAAGG TGGA6CAAAC ACAACCTCTG 1920 

GTCCTATAAG GACACCCCAA ATTCCAGAAT GGCTGATCAT CTTGGCATCX: CTCTTGGCCT 1980 

TCOCTTIGAT TCTTGCAGTT TGCATTGCAG TCAACAGTCG AAGAAGGTGT GGGCAGAAGA 2040 

AAAAGCTAQT GATCAACAGT GGCAATGGAG CTGTGGAGGA CAGAAAGCCA AGTGGACTCA 2100 

ACX3GAGAGGC CAGCAAGTCT CAGGAAATGO TGCATTTGGT GAACAAGGAQ TCSTCAGAAA 2160 

CTCCAOACCA GTTTATGACA GCTQATGAGA CAAGGAACCT GCAGAATGTQ GACATGAAQA 2220 

TTGGGGTSTA ACaCCTACAC CATTATCTTG GAAAGAAACA ACCGTTGGAA ACATAACCAT 2280 

TACAGGGAGC TGGGACACTT AACAGATGCA ATGTGCTACT GATrCTTTCA TTGCGAATCT 2340 
TTTTTAGCAT AAAATTTTCT ACTCTTAAAA AAAAAAAAAA AAAAAAA 



Seq ID NO: 132 Protein sequence:
Protein Accession #: AAH04372



1 11 21 31 41 51 

I I I I I I 

MDKFWHHAAM GUrLVPLSLA QIDLNITC31F AGVPHVEKNG RYSISRTEAA DLCKAFNSTI* 
PTMAQMEKAL SIGFETCRYG FIEGHWIPR IHPNSlCaUVN NTGVyiI.TSN TSQVDTYCFN 

ASRBPEBPCr SVTDLPNAFD GPITITIVNR nGTRWQKGE YRTNPEDIYS SNPTDDDVSS 

GSSSERSSTS GGYIFYTFST VHPIPDBDSP WITDSTDRIP ATSTSSNTIS AQWEPNEEME 

DERDRHLSPS GSGIDODEDF ISSTISTTPR APDHTKQfNQD VrCQWHPSHSN PEVIJ.QmTR 

mavDostarcT ayesnvdipba hpfuhhehh beebtphsts tiqatfsstt ebtatqxbqw 

roHRMHEGYR QTPREDSHST TOTAAASART aEPKQI».TTP SPED8SHTDF FHPISBPMQR 
GEQAORIWDM DSSHSTTI<]P TANPNTQLVB DUDRTQPI-SM TTQQSllSQSg S~ 



PGVTAVTVGD SNSNVMRSI.S 6DQDTFBPSG 6SBTTHGSES DGHSHGSQBS GAHTTSOFIR 
TPQIPEWbll LASULAUVLI LAVCIAVNSR RRCSQKKKLV IKSJSIGAVED RKSS6UIQEA 
SKSQEMVHLV NKBS3ETPDQ FMTADBTRNIi QNVUMKZGV 

Seq ID NO: 133 DNA sequence
Nucleic Acid Accession #: NM_002882
Coding sequence: 150-755 



OGAGGTTCGG GTOGTGGGGC 
GCGGAGOGAA GGASCTAOOA 
AGCCOAGCCG COGCCGCOGC 
ATGATACTTC CACTGAGAAT 
TTTCTCTTCC TGAGCAAOAA 
TCCQQQCAAA ACIOTTCOSA 
GCACIQGTGA CGTCAAGCTC 
G6AGOGACAA GACCCTGAAG 
AGCCCAACGC AGGTAGCGAC 
AGTGCCCCAA GCCAGAGCTG 
TCAAAACAAA GTTTGAAGAA 
CAGGCAAAAA TGATCATGCC 
ASGAGACCAA GGAaGATGCT 
TCTCTTTCCT TTCCTTTTTT 
ATTCTTTCAT TTTEACAAGG 

Seq ID NO: 134 Protein sequence:
Protein Accession #: NP_002873

1 11 21 31 41 51

I I I I I I 

HAAAKOTHED RDTSTEHTOE SNHDFQFEPI VSIf BQEIKT I>EEDEGELFK MRAKLFRFAS 60 
ENDLPEHKER GICDVKUUXB KEKGAIRIiLH RSDICTLKZCA HHYITPMMBIi KPHAGSDBAH 120 
VNHTHADFAD B CFKFB IJAI BFLNAEHAQK FRIKFBBCRK EIEEREKKAG SGKHDHABKV 180 
AEKLEAIiSVK BBTKBDAEEK Q 

Seq ID NO: 135 DNA sequence

Nucleic Acid Accession #: NM_000077.2

Coding sequence: 277-742 

1 11 21 31 41 51 

I I I I I I 

CCCAAOCTGG GflCGACTTCA OOTQTGCCAC ATTCGCTAAG TGCTCGGAGT TAATAGCACC 60 

TCCTCOGAGC ACrCGCTCAC GGOGTCCCCT TGCCTGQAAA GATACCJGOSG TCCCTCCAGA 120 

GGATrrOAGG QACAGGGTOG GAGGGGGCTC TTCCOCCAGC ACCGGAGGAA GAAAGAGGAG 180 

G66CTGGCC6 GTCACCAGAO GGTGGaGOGG ACCQ06IGC6 CIC6GCG6CT GGGGAGAGGG 240 

GGAOACCAGG CAGGGGGCGO CGGGlSAGCAa CATGGAOCOG GOSGGGGGGA. 6CAGCATG6A 300 

OCCr r OGGCT QACTGGCraa GCAOGGCOGC GGCCCGGSGT OGGGTAGAOG AGGTGOSGGC 360 

GCTGCTCGAS GCGGaGGOaC TGCCC»AOGC ACCOAATAGT TAOQQTOGGA OaCOOATCCA 430 

GGTCATGATG ATGGGCAGCO COOQAOXGGC SGAGCTGCTG CTGCTCCAGG GCGGGQAGCC 480 



ggagggaaga gogggogggc gggaggcgcc ggcgccagac 60 

gtagcoqccg agaggccgcq gagccagcga cgaccgaccc 120 

cgcgccccca tggcggcogc caaqgacact catgaggacc 180 

acagacgagt ccaaccatga ccctcagttt gagccaatao 240 

attaaaacac tggaagaaga tgaagaggaa crttttaaaa 300 

tttocctcts agaacqatct cccagaaigg aaggagcgag 360 

CTQAAGCACA AGGAGAAAGG GGCCATCCGC CTCCTCATGC 420 

ATCTGTQCCA ACCACTACAT CAOGCCGATQ ATGGAOCTGA 480 

OGTGCCTGGQ TCTGGAACAC CCACGCTGAC TTCGCCGACG 540 

CTGGCCATCC GCTTCCTGAA TGCTGAGAAT GCACAGAAAT 600 

TQCAGGAAAG AGATCGAAGA GAGAGAAAAG AAAGCAGGAT 660 

GAAAAAGTGG CGGAAAAGCT AGAAGCTCTC TC6GTQAAGG 720 

OAGGAGAAGC AATAAATOGT CTTATTTTAI TTTCTTTTCC 780 

TAAAAAATTT TACCCTGCCC CTCTTTTTCG GTTTGTTTTT 840 
GAOGTTKIAT AAAOAACIQA ACTC
CMCTGCSCC GACCCCGCCR CTCTCftCCCn ACCOGTGCAC QAOBCTGCCC GOOAGGGCTT 540

CCIGOACAO: CraGTOGTGC TQCACOGGGC CGGGGC60GG CTQGAOGTGC GCX3ATGCCTG 600 

GGGCOGTCTG CCXX?rGGACC TGGerGAGOA GCTQGaCCAT CaOaKIGTCG CaCGGTACCT 660 

GCXsGGOGGCT GOGGQGGGCA CCAGAGGCAG TAACCATGCC CGCATAGAT6 CCGCGGAAGG 720 

TCXXTCAGAC ATCCCOCSATT GAAAGAACCA GflGAGGCTCT GAGAAACCTC GGGAAACTTA 780 

GATCATCAGT CACXX3AAGGT CXTTACAGGGC CACAACTOCC CCOGCCACAA CCCACXICCGC 840 

TTTCX3TAGTT TTCATTTAGA AAATAGAGCT TTTAAAAATG TCCTGCCTTT TAAOGTAGAT 900 

ATATGCCTTC CCTCACTACC GTAAATGTCC ATTTATATCA TTTTTTATAT ATTCTTATAA 960 

AAATGTAAAA AAGAAAAACA CCGCTTCTGC CTTTTCACTG T6TTGGAGTT TTCTGGAQTG 1020 

.AGCACTCACG CCCTAAGOSC ACATTCATGT GGGCATTTCT TGOSAGCCTC GCAGCCTCCX; 1080 

GAAGCTGTCS ACTTCATGAC AAaCATTTTG TGAACrAGGG AAGCTCAGG6 GGGTTACTGG 1140 

3 CAGAACCAAA 6CTCAAATAA AAATAAAATA UOO 



I I I I I I

MEPAAGSSME PSAOWIATAA ARGRVEEVRA LIiEAOAIiPNA PNOTGRRPIQ VMmMSSARVA 
EUiIit£GAEP NCADPATLTR PVHDAAREGF LDTbWLHRA GARliOVHDRW GRLPVDLAEB 
USHROVARYIi RAAAQGTROS HHARIDAAEQ PSDIPO 



Seq ID NO: 137 DNA sequence 

Nucleic Acid Accession #: NM_058196.1

Coding sequence: 104-421



1 11 21 31 41 51 

I I I I I I

TGTOTGGGGG TCTGCTTGGC GGTGAGGGQG CTCTACACAA GCTTCCTTTC CGTCATGCXS3 60 

GCCCCCACCC TGGCTCTGAC CATTCTOTTC TCTCTGGCAG GTCATGRTGA TGGGCAGCGC 130 

CCGAGTGGCXS GAGCTGCTGC TGCTCX3VCGG CGCGC5AGCCC AACTGCGCXX3 ACCCCGCOVC 180 

TCTCACCXX3A CCCGTGCACG ACGCTGCCCG GGAGGGCTTC CTGGACACGC TGGTGGTGCT 240 

GCACCGGGCC GGGGCGOGGC TGGACX5TGCG CGATGCCTQQ GGCCGTCTGC CCXSTCGRCCT 300 

GGCTGAGGAG CTGGGCCATC GCGATGTOGC ACGGTACCTG CGCGCGGCTQ CX3GGGGGCAC 360 

CAGAGGCR0T RACCATGCCC GCATAGATGC CGOGGAAGGT CCCTCAGACA TCCCCGATTQ 430 

AAAQAACCAG AGAGGCTCTG ASAAACCTCQ GGAAACTTAG ATCATCAGTC ACC GAAO GTC 480 

CTACAGGGCC ACAACTGCXX: CCGCCACAAC CCACCXMGCT TTOGTAGTrr TCATTTAGAA 540 

AATAGAGCTT TTAAAAATGT CCTGCCTTTT AAOGTAGATA TAAGCCTTCC CCCACTACCG 600 

TAAATGTCCA TTTATATCAT TTTTTATATA TTCTTATAAA AATOTAAMA MAAAAACAC 660 

OGCTTCTGCC TTTTCACTGT GTTGGaaTTT TCTGOAGIGA GCACTCACGC CCTAAGG6CA 720 

CATTCATGTG GGCATTTCTT QCOROCCTCa CAOCCTCCSKJ AAI3CTGTCQA CTTCATOACA 7B0 

AGCaTTTTGT GAACTAGGGA AGCTCAGGG6 G6TTACTSGC TTCTCTTOAG TCACACTOCT 840 
AflCRAATGGC AGAACCRAAQ CTCAAATAAA AATAAAATAA TTTTCATTCA TTCACTC 



Seq ID NO: 138 Protein sequence:
Protein Accession #: NP_478103.1

1 11 21 31 41 51 

MMM6GARVAE UiUJIGAEFH CADPATLTRP VHDAARBGPIi DTIiWX^iMO ARIJ>VRDAW3 60 
RI>PVDIiAEEIj QHRDVARYliR AAAGGTROSH HASIDAAEGP SDIPD 

Seq ID NO: 139 DNA sequence 

Nucleic Acid Accession #: NM_058197.1

Coding sequence: 272-684 

1 11 21 31 41 51

I I I I I I

CCCRACCTGG GGOGACTTCa. GGTGTOCCAC ATTCQCTAAG TGCTOGQAGT TAATAGtavCC 60 

TCCTCCGAGC ACTOGCTCAC GGCGmXCCT TGCCTGGAAA GATACCGCGG TCCCTCCAGA 120 

GGATTTOAGG GACAGGGTCG GAGGGQGCTC TTCCS3CCA0C ACCGGAGGAA 6AAAGAGGA6 180 

GGGCTGGCTG GTCACCAGAG GGTGGGGCGG ACCGCX3TGCG CTCGGCGGCT GCGGAGAGGG 240 

GGAOAGCAGG CAGCGGGOGG CGGGGAGCAG CATGGAGCOG GCGGCGGGGA GCAGCATGGA 300 

GCOGGCOGGG GGGAQCftOCA TGGAGCCTTC GGCTGACTGG CTGGCCACGG CCGCGGCCCQ 360 

GGGTOGGGTA GAGOACGrGC GGGOGCTQCT GQAGGCGGGG GOSCTGCCCA ACX5CACCGAA 420 

TAGTTAOGOT CGGAGGCGQA TCXnGGTGGG TAGAAGGTCT GCAGCGGGAG CAGGGGATGG 480 

OGGOaSRCTC TaQASCaCGA ASTTTGCAOG GGAATTGGAA TCAGGTAGCG CTTCGATTCT 540 

CCGGAAAAAG GGQAOOCTTC CrGGGOAOTT TTCAGAAGGG GTTTGTAATC ACA6ACCTCC 600 

TCCTQSOQAC GCCXrTGGGGG CTTGGGAAAC CAAGOAAGAG GAATGAGGAG CCACX3CGCX3T 660 

ACAGATCTCT CGAATGCTQA GAAQATCTQA AQGGGGGAAC ATATTTGTAT TAGATQGAAG 720 

TGATGATGAT GGGCAGCGCC C6AGTGGCGG AGCTGCTGCT GCTCCAOSGC GOGGAGCCCA 780 

ACTQOGCCX3A CCCCGCCACT CTCACXXSSAC CCX3TGCAOGA CGCTGCCCX3G GAGGGCTTCC 840 

TGGACACGCT GGTGQTGCTG CACCGGQCCG GGGOQOGGCT GGACGTQCGC GATGCCTGGG 900 

GCC6TCTGCC OGIGGACCrG GCTQAGGAGC TGGGCCATCG CQATGTCGCA CGGTACCTGC 960 

OGSCOGCTQC GGGGQaCACC AQAOGCAQTA ACCATGCCOG CAXAGATGCC GCGGAAGGTC 1020 

CCICAGACAT CCCOSATTGA AAGAACCAGA GAQGCTCTGA GAAACCTCXSG GAACTTAGAT 1080 

CRTCAGTCAC OGAAGGTCXTT ACAGQGCCAC AACTGCCCXIC GCCACAACCC ACXrCCGCTTT 1140 

OGTAGrrTTC ATTTAGAAAA TAGAGCTTTT AAAAATGTCC TGCCTTTTAA CGTAGATATA 1200 

TGCCTTCCCC CACTACCGTA AATCTCCATT TATATCATTT TTTATATATT CTTATAAAAA 1260 

TCTAAAAAAG AAAAACACCG CTTCTGCCTT TTCACTGTGT TGGAGTrTTC TGGAGTGAGC 1320 

ACTCACXSCCC TAAGCX3CACA TTCATGTGGG CATTTCTTGC GAGCCTCGCA GCCTCCGGAA 1380 

GCTGTOGACT TCATGACAAG CATTTTGTGA ACTAGOGAAG CTCAGGGGGG TTACTQGCTT 1440 

CTCTTGAQTC ACACTGCTAG CAAATO6CA0 AACCAAAGCT CAAATAAAAA TAAAATAATT 1500
I I I I I I

MEPAAGSSME PAAGSSMEPS ADHLATAAAR GRVEEVRALI. EAGALFNAFN SYGRBPXQVG 60 
RRSAAGAGDQ GRLWHTKFAO ELESGSASIL RKKSHLPSEP SEOVCMHRPP P<mALGAWET 120 
KEEE

Seq ID NO: 141 DNA sequence

Nucleic Acid Accession #: NM_058195.1

Coding sequence: 163-684 

1 11 21 31 41 51

I I I I I I 

CCTCCCTACG GOCGCCrCOG GCAGCCXTTTC CCGCGTGOGC AGGGCTCAGA GCCKTTCCGA 60 

^ - Q&TCTTGGAQ GTCCGGQTGa GAGTGGQGGT GGGGTGGGGG TGGGGGTGAA GGTGGGGGGC 120 

20 GGGOSOGCTC AGGGAAG6CQ <3ST6aGOGCC TGOGGGGOGG AOATCGGCAG GGGGCGGTGC 180 

GTGGSTCCCA GTCTGCaWJTT AAGGGGGCAQ GAGTGGCGCT GCTCACCTCT GGTGCCAAAG 240 

GGOGGCGCAG OQGCTGCCGA GCTCGGCCCT GGAGGCGGOS AQAACATGGT GCGCAGGTTC 300 

TTGGTGACCC TCCGGATTCG GCGCXKSSTGC GGCCCGCCGC GAGTQAGGGT TTTCGTGGTT 360 

_ _ CACATCCCGC GGCTCACGGG GGAGTGGGCA GCGCC3VGGGG CGCOXSCCGC TGTGGCCCTC 420 

25 GTGCTGATQC TACTGAGGAG CCAGCGTCTA GGGCAGCAGC CGCTTCCTAG AAGACCAGGT 480 

CATGATGATG GGCAGCGCCC GAGTGGCGGA GCTGCTGCTG CTCCACGGCG CGGAGCCCRA 540 

CTGCGCCGAC CCCGCC3VCTC TCACCCGACC CGTGCACGAC GCTGCCCGGG AGGGCTTCCT 600 

GGACACGCTG GTGGTGCTGC ACCGGGCCXXJ GGCGCQGCTG QACGTGCGCG ATGCCTGGGG 660 

COGTCTGCCC QTOGACCTOQ CXGAGGAGCT GGGCCATCGC GATGTCGCAC GGTACCTGOS 720 

30 oooGQCTOCo aaaGGe»ce» gaoqcagtaa ccatgcccgc atagatgccg cggaaggtcc 7 so 

CTCAGACATC CCCGATTGAA AOAACCAGAG AGGCTCTGAG AAACCTCGGG AAACTTAGAT 840 
CATCAGTCAC CGAAG6TCCT ACAGGGCXAC AACTGCCCCC GCCACAACCC ACCCCGCTTT 900 
CGTAGTTTTC ATTTAGAAAA TAGAGCTTTT AAAAATGTCX: TGCCTTTTAA OGTAGATATA 960 
TGCCTTCCCC CACTACCGTA AATGTCCATT TATATCATTT TTTATATATT CTTATAAAAA 1020 
35 TGTAAAAAAG AAAAACACCG CTTCTGCCTT TTCACTGTGT TOGAQTTTTC TGOAGTOAOC 1080 
ACTCACGCCC TAAGCGCACA TTCATGTGGG CATTTCTTGC GAGCCTOGCA GCCTCCQGAA 1140 
GCTGTCGACT TCATGACAAG CATTTTGTGA ACTAGGGAAG CTCAGGGGGG TTACTGGCTT 1200 
CTCTTGAQTC ACACTGCTAG CAAATGGCRG AACCAAAGCT CAAATAAAAA TAAAATAATT 1260 
TTCATTCATT CACTC 




I I I I I I

MGR6RCVOPS IiQIiRGQEHRC SPLVPRGQAA AAELGPCG6B NMVBHFLVTIi RIRBAOGPFR 
VRVFWHIPR LTGEWAAPGA PAAVALVLML LRSQRLGQQP LPRRPGEDDQ QRPSGGAAAA 
PSRGAQLRRP RHSHPTRARR CPGGLFGHAG GAAFGRGAAG B 

Seq ID NO: 143 DNA sequence
Nucleic Acid Accession #: NM_018131
Coding sequence: 412..1107



1 11 21 31 41 51 

I I I I I I

GAAATTGCAC ACTTAAAGAC ATCAGTGGAT GAAATCACAA GTGGGAAAGG AAAGCTGACT 60 

GATAAAGA6A GACAGAGACT TTTGGAGAAA ATTCGAGTOC TTGAGGCTGA GAAGGAGAAG 120 

60 AAISCTTATC AACICACAGA GAA6GACAAA GAAATACAGC GACTGAGAGA CCAACTGAAG 180 

addWSATATA GTACIAOCGC ATTGcrCQAA aUSCnSOAACI AOACAAOSAQ AGAAfiG AGAA 240 

AOGAGGOAGC AaOTGTTOAA AGCCTTATCT GAAGAISU\Aa AOSIATT8AA ACAACAOTTS 300 

TCTGCTGC31A CCTCACQAAT TGCTGAACTT GAAAGCAAAA CCAATACACT COGTTTATCA 360 

CAGACTCTGG CTCCAAACTG CTTCAACTCA TCAATAAATA ATATTCATOA AATGOAAATA 420 

65 CAGCTGAAAG ATGCrCTGGA GAAAAATCAG CAOTGGCTCG TGTATGATCA GCSWSCGGGAA 480 

GTCTATGTAA AAGGACTTTT AGCAAAGATG TTTGAGTTGG AAAAGAAAAC GGAAACAGCT 540 

QCTCATTCAC TCCCACAOCA GACAAAAAAG CCTGAATCAQ AAGGTTATCT TCAAOAAGAG 600 

AAGCaCJUUVr OTTACauVCaA TCTCTTGOCA AGTGCAAAAA AAGATCTTGA GQTTGAACGA 660 

CAAACX»TAA CTCAGCTGAG TTTTOAACTO AGTGAATTTC GAAGAAAATA TGAAGAAAOC 720 

70 CAAAAAGAAG TTCACAATTT AAATCAGCTG TTCXATTCAC AAA6AAGGGC AGATCTGCAA 780 

CATCTOOAAS ATGATAGGCA TAAAACAOAO AAOATACAAA AACTCAGGGA AGAGAATGAT 840 

ATTGCTAGGG GAAAACITGA AGAAGAGAAG AAQAGATCGB AAGAQCTCTT ATCTCAGGTC 900 

CAGTCTCTTT ACACATCTCT GCTAAAGCAG CAAGAAOAAC AAACAAGGGT AGCTCTGTTG 960 

OAACAACAGA TOCAGGCATG TACTTTAGAC TTTGAAAATG AAAAACTCGA CCGTCAACAT 1020 

75 GTGCAGCATC AATTGGATGT AATTCTTAAG GAGCTCCGAA AAGCAAGAAA AAATAACACA 1080 

GTTGGAATCC TTGAAACAGC TTCATGAGTT TGCCATCACA GAGCCATTAG TCACTTTCCA 1140 

AGGAOAGACr GAAAACAOAQ AAAAAGTTGC OQCCTCACXa. AAAAGTCXZCA CTGCTGCACT 1200 

CAAT(3GAAQC CTOGTGaAAT GTCOCAAGTG CAATATACAO TATCCAGCCA CTQAGCATOG 1260 

OOATCTGCTT GTOCATOTGa AATACTGTTC AAAGTAGCAA AATAAGTATT TGTTrTGATA 1320 

80 TXAAAAGATT CAATACTQTA TTTTCTGTTA GCTTGTGGGC ATTTTGAATT ATATATTTCa 1380 

CaTTTTGCAT AAAACTGCXTP ATCTACCTTT GACACTCCAG CATGCTAGTG AATCATGTAT 1440 

CTTTTAGQCT GCTOTGCATT TCTCTTGGCA GTOATACXTTC CCTGACATGG TTCATCATCA ISOO 

GGCTOCAATG ACAGAATGTO GTGAGCAGCX: TCTACTGAGA TACTAACATT TTGCaCTOTC IS 60 

RAAATACTTG GTGAGGAAAA GATAGCTCAG GTTATTGCTA ATGGGTTAAT GCACCAGCAA 1620 

85 GCAAAATATT TTATGTTTCG QGGGTTTTGA AAAATCAAAfi AXAATTAACC AAGGATCTTA 1680 

ACtOTGTTCG CATTTTTTAT CCAAGCACTr AQAAAACCTA CAATCCTAAT TTTGATGTCC 1740 

AXTOTTAAGA GOrGGTOATA GAIACTATTT TTTTTTCATA TTGTATAGCG GTrATTAGAA 1800
TCCCCAACTC TGTTCTQOGC AOGAAACAOT ATCTQTrrG& OGCAXAKICT TAAQTGGCCA. 1920
dVCACAATQT TTTCTCTTAT OTTATCTOGC ASIAKCTGTA ACITGAATTA CIVTTAGC31CA 1980 
TTCTGCTTAG CTAAAATTGT TAflAATAAAC TTTAATAMC CCATOTKGCC CTCTCATTTG 2040 
ATTGACAGTA TTTTAGTTAT TTTTGGCATT CTTAAAGCTG GQCAATGTAA TGATCAOATC 2100 
TTTOTTT6TC TGAACAGGTA TTTTTATACA TGCTTTTTGT AAACCAAAAA CTTTTAAATT 2160 
TCTTOVaQXT TECTAACATG CTTACXACTG GGCTACK3TA AATGMShAAA 6AATAAAATT 2220 
ATTTAATQTT TT 



Coding sequence: 



1 11 21 31 41 51

I I I I I I 

^- CCQCCMSKn •tOAATCQGGG GACCCGTTGG CaVGAGGTGGC GGGGGCGGCA TGGGTGCXCC 60 

30 QRCSITGCCC CCTGCCTGOC AGCCXrnTCT CRAGGACCAC CGCRTCTCTA CATTCAAGAA 120 

CTGGCCCTTC TTGGAGGGCT OCGCCTGCRC CCCGGRGCGG ATOGCOGAGG CIGGCTTCAT 180 

CCACIGCCCC ACTGAGAACG AGCCAGACTT GGCCCAGTGT TTCTTCTGCT TCAAGGAGCT 240 

GGAAGGCTGG GAGCCAGATG ACGACCCCRT AGAGGAACAT AAAAAGCATT OGTCCGGTTG 300 

CGCTTTCCrr TCTGTCAAGA AGCAGTTTGA AGAATTAACC CTTGSTGAAT TTTTGAAACT 360 

35 GGACAQAGAA AGAGCXaVAGA ACAAAATTGC AAAGGAAACC AACAATAAGA AGAAAGAATT 420 

TGAGGAAACT GCX3AAGAAAG TGOSCCGTGC CATCQAGCM CTGGCTGCCA TGGATTGAGG 480 . 

CCTCIGGCCXS GAGCTGCCTG GTCCCAGAGT GGCTGCACCA CTTCCAGGGT TTATTCCXrrG S40 

GTGCCACCAQ CCTTCCTGTG GGCCXXTTTAG CAATGTCTTA GGAAAGGAGA TCAACATTTT 600 

CAAATTAGAT GTTTCAACTG TGCTCX:tGTT TTGTCTTGAA AGTGGCACCA GAGGTGCTTC G60 

40 TGCCTGTGCA GCGGGTGCTG CTGGTAAC31G TGGCTGCTTC TCTCTCTCTC TCTCTTTTTT 720 

GGGGGCTCAT TTTTGCTCTT TTGATTCCOG GGCTTACCAG GTGRGRAGTG AGGGRGGftAG 780 

AAGGCAGTGT CCCTTTTGCT AGAeCTGACA GCTTTGTTOO CGTGGGCAGA GCCTTCCACa S40 

GXGAATGTGT CTGGACCTC3V TGTTGTTGAQ GCTOTOICAS TCCTGAiSIGT GGACTTGGCA 900 

GGTGCCTGTT GAATCTGAGC- TQCftOOTTCX: TTATCTOTCA CKSTKSTGCC TCCTCBGftGG 960 

45 ACAGTTTTTT TGTTaTTGTG TTTTTTTGTT ■ rrmTl ' lVX ' GGTAGATGCA TGACTTGTGT 1020 

GTGRTGAGAG AATGGAOACA GAGTCCCTOG CTCCTCTACT GTTTAACAAC ATGGCTTTCT 1080 

TATTTTGTTT GAATTGITAA TTCACAGAAT AGCACAAACT ACAATTAAAA CTAAOCACAA 1140 

AGCCATTCTA AGTCATIGGQ QAAAOGGGGT GAACTTCAGG TGGATGAGGA GACAGAATAG 1200 

AGTGATTUSGA AGCGTCTGGC AGATACTCCT TTTGCCACTG CTGTGTGATT AGACAGGCCC 1260 

50 AGTGAisCCGC GGGGCACATG CTGGCCGCTC CTCCCTCAGA AAAAGGCAGT GGCCTAAATC 1320 

CTTTTTAAAT GACTTY3GCTC GAT6CTGTQG GGGACTGGCX GGGCTGCTGC AGGCCGTGTG 1380 

TCTOTCAGCC CAACCTTCAC ATCTOTCACG TTCTCaUSlC GGGGQAGAGA OGCAGTOOGC 1440 

CCAGQTCOCC GCTTTCTTTO GAGGCAaCAG CTCCX3SCAGQ GCTGAAGTCT OGCGrAAGAT ISOO 

GATGGATTTG ATTCSOCCCTC CTCCCTGTCA TAGAGCTGCA GGGTGOATXG TTACAGCTTC 1560 


1 11 21 31 41 51

I I I I I I 

MQAPTLPPAH Q7PLKDKRIS TFKimPFLEX} CACTPERMAB AGFIHCPTEH BPDLAQCPFC 
FKEIiEGHEPP DDPIEERKKH SSGCAFLSVK KQFBELTLGE FIJOiDRBRAK NKIAKEimiX 
65 KKEFEBTAKK VRRAIEQI<AA MD 



Seq ID NO: 147 DNA sequence

Nucleic Acid Accession #: NM_014176.1

Coding sequence: 127-720 



1 11 21 31 41 51

I I I I I I 

GCGOGCAGCG CrGGTACCCC GTTGGTCCGC GCX5TTGCTGC GTTGTGAGGG GTGTCAGCTC 60 

AGTGCRTCCC AGGCAGCTCT TAGTGTGGAG CAGTGAACTG TGTGTGGTTC CTTCTACTTG 120 

75 GGGATCATGC AGAGAGCTTC AOGTCTGAAG AGAGAGCTGC ACATGTTAOC CACAGAOCCA ISO 

CCCOCAGGCA TCACATGTTG GCAAGATAAA GACCAAATOG ATGACCTGCS AGCTCAAATA 240 

TTAGGTGGAO CC31ACACACC TTATQAGAAA GGTGTTTTTA AGCTAGAAGT TATCATTCCT 300 

OAGAGGTACC CATTTGAACC TCXTTCAGATC CX3RTTTCTCA CTCCAATTTA TC3VTCCAAAC 360 

ATTGATTCTG CTGGAAGGAT TTGTCTGGAT GTTCTCAAAT TGCCACCAAA AGGTGCTTGG 420 

80 AGACCATCCC TCAACATOGC AACrGTGTTG ACCTCTATTC AQCTGCTCAT QTCAGAACCC 480 

• AACCCTGATG ACCCGCTCAT GGCTGACATA TCCTCAGAAT TTAAATATAA TAAGCX31GCC 540 

TTCCTCAAGA ATGCCAOACA GTGGACAGAO AAGCATGCAA GACAGAAACA AAAQQCIQAT 600 

GAGGAAGAGA TGCTTGATAA TCTACCAGRG GCTGGTGACT CCAGRGTACA CAAC TCAAC A 6 SO 

CAGAAAAGQA AGGCCAGTCA GCTAGTAGGC ATAGAAAAGA AAtTTC»XCC TOATXSTTTAG 720 

GGGACrrGTC CTGGTTCATC TTAGTTAATG TOTTCTTTGC CAAQGiraXC TAAGTTGCCT 780



I I I I I I

MQRASRLICRE LHMLATEPPP GITCWQDKDQ MDDLRAQILG GANTPYEKGV FKIiWIIPER 
YPFEPPQIRF LTPIVHPNID SAGRICLDVL K1.PPKGAWRP SUIIATVbTS IQLUISEPMP 
ODPUtADISS BFKXNKPAFL KNARQWTEKH ARQKQKADEE EMLDtOjPEAG DSRVHKSTQK 



Seq ID NO: 149 DNA sequence
Nucleic Acid Accession #: NM_003812
Coding sequence: 224-2722 






ACGCGGCCCC 
GCTTCTCGTC 
QGCTGCTGCG 



: CXSAQCOSGOB TGAC06GCTC OGCCOGOOGC CGCCCaX»0 CTAGCCOGGC 180 
3 GCCACAOGGA 
CAGCCXKXTCX: 

GCCGGCTCGG TGCCTGCCAG CGCCCCGGCC CGCACGCCGC CCTGCCGCCT 360 
CTTCTCCTGC 
CCCAGCX3CTC 
GACAATACAT 
GAAATCACAC 
CAOGTTCTTG 



TCGCGGGCTG 
TGCCTGCCAG 
TGCCTCCGCT 
CGCATTGGAA 
TGCAACAGAA 
TGCCTTCAAO 



ACTCATATAT 



CATACTGAAC 



AGCAGTGAAT 
TAATOATCAC 
AAAOTCCGTQ 
CCTGGTGGCT 
GCAGATQCTC 
GCACCTCATC 



AATG6TTTGT 
TCTAAGGGTO 
GTQQCTCTGT 
AT6ATAGAGC 
CAGAAAACCT 
GACCAGTGGC 
CCATCACGTG 
AAAACGTATA 
GTCAACCTTG 
GTAGAGACCT 
CATGAGTTCT 



TCCAGATTGA 
TGTCTTCTGA 
GAGAGCACTG 



CACTAGAGCT 



CCTTTCTCTC 
GTATATTTGA 
AGAAGCATCO 
TGGATTCTAT 
GGACTQAGAA 
CAAAATACCG 
CATrrCACTA 



TTATGTGGAG 
TTACTACCAT 
TOGACTTCAT 
GGTTCATGAT 
GTATTCTAAG 



AGAAATGAAA 
CTCTTCTCAT 
TTACAAGGAO 



TCCCGGCCCC 
GAAAAAAATT 
AATATCAGTT 
TACATCAACC 
CAAAftACATA 
TCCAAATTCA 
ATTCACTACG 
GGAAGCATCA 
GGCATGTTTQ 
GAGAAAAGCA 
CAAATGAAGA 
TGGTTGAAAA 
TATTTGGAAC 



GCGCCTGGGG 
TGGGAGTCCT 
ACAGCAATGC 
AAGACTCGSA 



GCAGOGCATT 



GACATCACCA 
AAOCM3CATG 
AGTCTGAGTT 



TTCTTGACCr 
AAAATGGGAA 
GAGGCGTCAA 
AAGATGATAC 
CAGGTC6ACC 
ATCTCACTAT 
OAAGQAAGAG 
TTATGATTGT 
ACAACTTTGC 
CCAGGGTTGT 
CCAACCCTGT 
CTGATGCTGT 
ACTTTGOAGG 



GTCCCATTCT 



AAATGGATAC 
ATTATGCTGT 
TAACAATACC 
GTGTGATATT 
GCAAOACGGA 



TOTGACTGCA 
CGAAAATTTT 
GCCTGCCTTT 
GTGGAAGCTG 
AAGAAA TGTT 
TCRTGTCTTT 
ACTGAATATP 



CAGAATCCTG 
CAAAGTGCAG 
TCAACAGGCC 
GGGAGGAGTG 
CCCTCTCCAA 
TTCAGCCACG 



CCTTGGAATC CAATGQQAAC CTTCTAGCAG 
6GGTQ8CTGC ATCATGGAG6 AAACAG6GGT 
CATTTTQGAG TATAGAGACT TTTTACAGAG 
AACAAAGCTA TTTGAGCCCA CGGAATGTGG 
TGATTGTGGT TTTCATGTGG AATGCTATGG 
CGGGGCTCAC TGCAGCGACG GGCCCTGCTG 
AGGGTATGAA TGCCGGGATG CTGTGAACGA 
TGCCCACCAA ATCTTCATAA 



CTATGAAAAG 
GTGGATTCAG 
TCGAGCTCCA 
AGGCCGGGTG 
CTATGTAGAA 
ACAAATTCAA 



CTGAATACAG 
T6CAGCRAAC 
CGTATTGGTC 
ATTGACTGCA 



ATGATGTGTT 
AACTTCAGGG 
GTGGTGCCCA 
CATGTGGCCC 
TGAGCA6CTG 
ATGAAOCCAC 



GAAGGGAAAC TGCGGGAAG6 
CTGTGGATTC TTACTCTGTA 
TGAGATCATT CCAACTTCCT 
TGTAGTTTTA GATGATGATA 
GTCTATGATG TGTTTAGATC 
TCCACTOQAT TCCAAGGGTA 
CTQCATTTOT GATTTCACCT 
CCTTCACCCC GCCAAGGATG 



ACAAOTTCTG 
ATGGAGACC6 
CCAATCTTAC 
TCTACCATCA 



TATTGTCCTT GGGGGCACAQ 6CTGGGGA1T TAAAAAXCXC AAGAAGAGAA GGTTCGKTCC 
TACTCAGCAA GGCCCCATCT GAATCAGCTG COCTGGATCG ACACCGCCTT GCACTGTTGO 
AITCrGGGTA TGACATACTC GCAGCAGTGT TACTGGAACT ATTAAGTTTQ TAAACAAAAC 
CTTTGQGTGG TAATGACTAC GGAGCTAAAG TTGGGGTQAC AAGGATGGGG TAAAAGAAAA 
CTGTCTCTTT TGGAAATAAT GTCAAAGAAC ACCTTTCACC ACCTGTCAGT AAACGGGGGA 
GGGGGCAAAA GACCATQCTA TAAAAAGAAC TGTTCCAGAA TCTTTTTTTT TCCCTAATGG 



1200 
1260 
1320 
13ao 
1440 
1500 

iseo 

1620 



1920 
1980 
2040 

2160 
2220 
2280 
2340 
2400 

2520 
2580 
2640 



MRPPGSSSRQ PPLAGCSU^G ASCGPQRGPA G 



SPSAHGAAAP SABlmHEXAE 



I.PMAVAQVI.S 
DDFLORGGGA 
SDGPCCmiTS 



KTIiAGQYSKQ 
TYKKHRSSHA 
EFSKYRQRIK 
OSLAQKLGIQ 



KMLGVLADED 
KBNKAVHLAQ 
SIRSVKDSKV 



HTKNFAKSW 



WEPSSRKPKC 
EPTECGNQYV 
RDAVMECDIT 



ALSTCNGLHG 
QWPFI.SBLQH 
NIiVDSIVKEQ 
RVTFHYKRaS 



TPPCRLLLVL 
ISYSHAMQKB 
KFIUJLILNH 
MFEDDTFVYM 



UJ.FPLAASS 
ITLPSRilYY 
GLLSSDYVBI 
lEPLBLVHDE 



ETWTBKDQID 
TRGVGVNEYG 
KFSKCSII.EY 
KCSLSNGAHC
WmJLTRAPR IGQIiQGEIIP TSFYHQGRVI
LDRKCLQIQA LNMSSCPXJ3S KGKVCSGHGV 
KDEGPKGPSA TNLIIGSIAG AILVAAIVLG 



Seq ID NO: 151 DNA sequence
Nucleic Acid Accession #: NM_023915
Coding sequence: 250-1326






I 



TCAAAGCTTA 
GTGAATGGAC 
CXKACGCCTC 
AACT6AAGAA 
CAAOAaAGTC 
AATGAATTTG 
TTGCTGAATG 
TTCTATCTCA 
ATAGTCCaTG 
TCAGTTTTGT 
GATCGCTATC 



TTCTTAATTA 
AGCCAGCCAC 
AATOGTCCCC 
TGGGGTTCAA 
ACAATTCAGG 



rTTACC 



GAGACAAGAA 
CACAATGAAA 
AAGTGTTTCC 
CTTOACGCTT 
CAACAGQAQC 
CTTGCCGGTO 
GTTTAQCAOT GTGQATCTTC 



31 
I 

AGAAAATCXA 
ACCTGTTTCA 
GAAATCAAAC 
TQACACGCAT 
GCAAAATTAC 



CTTCCCTGCC 



CAGGAATAAC 
CTTTGCTTAC 
CAAATAACX^A 



51 
I 

GACCTTAGTT 
ACCGTATGAG 
CTATGCTGAA 
AGTGCATCAC 
GCTGCACXK3C 



ATCXTTGACAA 



ATGCAGGATT 
TTTATGCAAA 
TGAAGGTGGT 
TATCrOTTTG 
ATGGTCAGCC 



GGTTGCAGAC 
TGGACCTTGG 
CATGTATACT 
CAAGCCATTT 
TGTTTGGGTO 



CTTTATCTCA 
TTCCACATTA 
CTCATAATGA 
TACTTCAAGT 



TTATATTTGT GGCAAGCATC 420 



GTGCTGGTGA 
AGGCAATTCA 
GTGGCTGTGr 
AGTCACTTAG 
ATTACACTTT 
TGTAGGTCAT 
ATCAQATCAC 
GTGTAGGCCT 
TTCATTATCC 



TTCTGATCGG 
TAAGTCAGTC 
TrTTTACCTG 
ACAGGCTTTT 
TCTTGTCTGC 
TTTCAAGAAG 
TGCAAAGTGT 
TTTATTGTTT 
TTAAAAAAAA 



TACOGCAOTC 
ATGTTACATA 
AAGCCGAAAG 
CTTTCTACCA 
AGATGAATCT 
QTGTAATGTT 
OCTGTTCAAA 
QAOAAQATCO 



GGGGACTCTC 
ATCATG6CTG 
AATATCCATG 



CGCTGACATT 
TTATTCTCTG 
TCCTTGGGCT 
GGATGTACAG 

ACTGCTCAAA 



GATAAGCATT 
CATAACCTTC 
QCCAAACATC 
ACTTAAAAGT 



AA 



ACCTATQIOA ACAGCIGCTT QTTiaTQGCC 900 

GCCATATCav GOTACATCCA CAAATCCM3C 960 

OGAAAACATA ACX3M3AGCAT CAGGGTTGTT 1020 

TATCACTTGT GCAGAATTCC TTTTACTTTT 1080 

GCACAAAAAA TCCTATATTA CTGCAAAGAA 1140 

TGCCTGGATC CAATAATTTA CTTTTTCATG 1200 

AAATCAAATA TCaWJAACCAG GAGTGAAAGC 1260 

GAAGTTCGCA TATATTATQA TTAC^CTGAT 1320 

ATATGTACAA A6TGTAAATA AArrGTrTCTT 1380 






1 11 21 31 41 51 

I I I I I I 

MQFtn:.TLAKL MNBLHGQES KNSCaiRSOGP GKNrn.HNEF DTIVI.PVLYL IIFVASIUJI 
eiAVWIFFHI HNKTSFIPYI. KNIWADLIM TI<TFPPRIVH DAGFGPWyPK FILCBYTSVI. 
FYANMYTSIV FLGI.CSIDRY UCWKPPGDS RMVSITPTKV LSVCVWVIMA VLSLTOIIbT 
NGQPTEDNIH OCSKUCSPI.G VKWHTAVTYV NSCLFVAVliV ILlGClflAIS RYIHKSSRQF 
ISQSSRKHKH HQSIRWVAV PFTCFLPYHI. CRIPFTPSHI. DRLLDESAQK liYYCKEITL 
PLSACNVOJl PIIYFFMCRS FSRRI.FKXSM IRTRSESIRS LQSVRRSEVR lYTOYTDV 

Seq ID NO: 153 DNA sequence
Nucleic Acid Accession #: D80008.1
Coding sequence: 149-739 



: AAAGCGCGGA GCGGAGGCCG 



AGGCGAGAGC CTGGCGCTGT 
ACCATTTTGG CX3TGAGAGCT 
GTTCTGCGAA AAAGCCATGO 
lAGGGCAACT GCCrOCCTTC AACSA06ATG 



AGGACTAGAA 60 
GGIGGTTGGC 120 



GTCAGGTOaA 
AAATCQACGC 
ATGGGAATAT 



C6AA6TQATT 
TGCACTGTAG 
GGTAGCGTCT 
AATAATTATA 
GACATTACAC 



GTQTCTAAAA 
AAATAGCCAG CACCTTTTAC 
GQAGCSVCATC CTOTCATGAC 
CTCCTCTGTA CTCACTCTCT 
TTAAGATAAC 
TTTTTTAATG 
GTTTTGTAGA GACTGTCTCA 
AGTCCTCCCA CCTTAGCTTC 
CCCCTACTCC TTTTTCTAAT 
GTQTaTTTTT TAAATOAAAG 



CTTTGTATGA 
TGATACCAAC 
CATACCTGTA 
TGCCAAATGC 
T^AAQATCTCT 
AG6ATATGAA 
AATTTGAAGT 



TATCAAATTT 
TGACCGCTTG 
ATTACGATTT 
TQCTACTTAT 
ACCACCAAAA 
TGAtGATGGC 
ATGTGAGCAG 



CGACACTGTT 
CTTOSQATCA 
CACATGGCTQ 
ATGAGGTCAC 
AGCCTATATA 
ACTTCAGTCC 



CCAGCACTCC 
TAAGAATACT 
TTGTACACTA 
CTATGTTGCC 
TCAAAGTGTT 
AAGCTGTATC 
TAAACAT6GT 



CTTCACCTCC 
TGGCTAAGAA 
TTCTTCCTAC 
CAAGCTGGTC 
GAGATCACAQ 
TGTAATCACA 
TACATTTGAA 



GGCTTCACTC 
CTCTTTQATT 
GTATAATTTG 



TCAAACTCCT 
GOGTQAGOCA 
QCATTCCTAC 



TTAGAAGCTA 
CTAACTATTA 
TTTTGGTTTT 
GGCCTCAAGC 



CATTTTCAAA 

TGGTCTGTAO AAATTTTCAG TATATATAAT GTTTAATOAC 
TATTTGOGAA GQAAGGACAC ACATGGATTT TGCACATTTC 
CTTGTGGCTA TGGGGTGATC ACCAGTATCA 
CTAGAGAAGG AACTTTGTAC AGTTTTCCCT 
AGAGTTGATT GTCTTTTAAT GGTATGTTTT 
TCCAGTTTAT TCGTTTGTTC TTTT ATGC TT 

TCCCAASATC ACAATTTTTT T TCLTir JTA 

TACrrTGGTC TATOACCOQT TTTTTTTTIT GTTrTGTTTT 



TTCTGGTCAT 
TCACATGCAA 
ATACTAATTT 



GAGATTCAOA 
AAACAGCTGA 
TGQGTGTTGC 



AGGGQACAGT 
TTGACTGAAA 
CATTTTAAAT 
ATCCGAGAAA 
TOTTATA ATT 
GTTTTTT0C3T 



: TCTATCCCCT C 



TcrerCACCC agociggggt gcagtcggct gatcttggct 



CrCTGTTAAO 360 

GAGCACTCAG 420 

CTGAAGAAAT 480 

TGGQAGGAGA 540 

TTGAAGTCCS3 600 

TATTAAAAAA 660 



1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 

1620 
1680 
1740 
1800 
1860 
1920 
1980 



AGTTGTTACA 
AAGCAGTCAC 
qtgtkttqia 
OTGAAGATOA 
ATCATCTGGC 



GAAATTGGGG 
AGTCACATGA 
TTTGATGAAA 
TCTTTTCCCA 
TTAAGCTTTA
TTTTACCATG TTGGCCAGGC
CXICAAAGTTT TGGGATTACA. 
GAATTTTTTA TATGGTGCAA. 
TCCAGCTGTT TCACTACCAT 
TTTGTTAAAA AGTAGTTGTC 
ATTQACCTGT TTTTCTCTCC 
TCTAATAATT CTTSAAACAG 
TTTGTAGAGA TGGGGTTTCA 
ATACACTTGC CTCGTCCTCC 
CAGTGTACCA CATTTCTTTT 
GTQAAATTTG GGAACAGGCA 
GGCCTAGATG GGTGGATCAC 
AACTCCGTCT CTACAAAAAA 
C3VC3\GTTACA CGGCAGGCTG 
GIXSAGCTQAG ATCACACCAC 
AAAGAAATTA GGATCAATTT 
CACCTTGATT OAGATTGCAT 
AATATTOAGT CTrCTGGCCT 
TCTATTTCTC TTAATAATCT 
CATAGTTTTG ATGCTAAATG 
AATAOA AATA CAATTGATGT 
ATQaTQTTTT TGTAAATTAC 
TTC 



TGGTTTCAAA 
AGTGTGGGCC 
GGTGTCAATC 
TTTTTGAAAG 
AATGTATATG 
TGAATGCCAA 
ATA6TATTAA 



CTCCTGACCT CAAGTGACCC 
ACCGCGGCCA GCCTATGATC 
CACCrtCACT TTTTCTTGGG 
GACTGCCCTT TGCTCTATCA 
TGGGTTTATT TCAGGACTCT 
TACCATATTT GXATGTAGTO 



ACCTTGGCCT 
CATTTTGAAT 
AATATAGATA 



CCATGT6CT6 
TGAGATTTGT 
GGGTGTGGTG 
TTGAQCTCAG 
TAGAAAAAAT 
AGGTGGGAGQ 
TGTACTGCAG 
GTCAATTTCT 
TQAATTTATA 
r ATAAACAAGQ 1 



CAGGCTGTGT 
GGATTACAGG 
TTTGGCTATG 



GTAATCCTAG 
CCAGCCOSGG 
TGGTGGTGCA 
CCCCAGAGGT 
AAAGtGAGAC 



GTTTTGTTCC 
TATGT AATrr 
GTTTGTATTT 
ASCTAAAGCA 
GGTGCTGGCC 
TGCTTTTGAT 
AACTTTGGGA 
CCTATGGCAA 
TGCCTGTAGT 
CAAGACTGCA 
TCTATCTCAA 
CCCIGTTBGG 
ACAICTTAAT 



TCAGTCIACA GGTCZACCAT GTCA6CATTT 
ATTTCAAATT CTAACCACTT GTTGCTAGTA 
TCCTTCAGCC TTGCTAAACT GTGAGTTCTC 
ATGTGTTCTA TGAATARAOA GTTTTACTCC 



Seq ID NO: 154 Protein sequence:
Protein Accession #: BAA11503.1



21 



31 



41 



51 



I I 
QTTCGGCGCC AAAGCGCGGA 
CGAAAGGA6T GAGGCGCCGA 
AAGGCCGCGG GAGTGGGAAG 
CGAGCroCAT CGCGCGCCCS 
AGTTCraGAG GAGATGAAAG 
CTCAGGTGOA CSAAGTQATT 
AAATCGACGC TGCACTQXAG 
ATQGGAATAT GGTASOOTCT 



31 



41 



51 



TGAAQGTTT6 
ATGCA6TGGC 
CAACCTCCAC 
GCACTTCAGT 
AGCTGATCAG 
CA GGCTTC AC 
CCCTCTTTQA 
AAG TATAATT 
ACTCTTTTTT 
TCCCAAACTC 
AGGCGTGAGC 



OACATTACAC 



CTCCCAGGTC 
CCTATTAAAA 
ACAAGGAGTC 
TCAACTCATG 



TTTTCTGGTC 
AATCACATGC 
ACATACTAAT 
TCCACCATGQ 



ATGTGTATTG 
AAGTGAAGAT 
TTATCATCTG 



AAAGATCTCT 
AGGATATGAA 
CTCAACCTGC 
CGGTGTCTAA 
AAAAATAGCC 
CTGGAGCACA 
GACTCCrCTG 
lATAGACATT 



GCAGTCCTCC 
6GCCCCTACT 
CAGTGTGTTT 
ACTTGGCTGG 



ACCACCAAAA 
AACCTCCACC 
AAGACTATGG 
A6CACTTTTT 
TCCTGTCATG 
TACTCRCTCT 
GTTTAAOATA 



AGCCIATATA 
TCCCAGGTTC 
AGAATTTGAA 
ACCTCGATGG 
ACCATGCGCC 
CTCCACCACT 
ACTAM»ATA 



GTGCAGTGGC 
TGTCTCAGCC 
TGXATTTTTA 



OTOAAATTGQ 
AAAGTCACAT 
ATTTTGATGA 
AATCTTTTCC 
TTTTAAGCTT 
GTTTGTTTCT 
GTGATCTTGG 



QA6ACT0TCT CACTATOTTG CCCAAGCTG6 
CACCTTAGCT TCTCAAAGTG TTGAGATCAC 
CCTTTTTCTA ATAAGCTGTA TCTGTAATCA 
TTTAAATGAA AGTAAACATG GTTACATTTG 
ACAGGAAGAA GGTAGATCCT GTGTGTCTTG 
AGAGCTGAAT TTCTGAGATA CACATTTTCA 
AGAAATTTTC AfiTATATATA ATGTTTAATG 



ATTCGTTTGT TCTTTTATGC 1 
TCACAATTTT 1 
TCTATGACCC G 
ATGGAGTCTT G 
TCTCTATCCC C 



r GTTCTGTCAC C 
: CTGGGTTCAA G 

: CGCcaoGCCT c 



CAGCCTA TOA 

TTTGCTCTAT 
TTTCAGGACT 
TrerATGTAQ 
ATTTTT6CTG 
GTTQAACTCC 



TCCATTTIQA 
GGAATATAGA 
CACCTTTGCA 
CTGTTTTGTT 
TGTATGTAAT 
TTGTTTGTAT 
7GAGCTAAAG 



TTTCTAATAA 
TTTTTGTAGR 
CAATACACTT 



TTTGGOATTA 
TATATGGTGC 
TTTCACTAOC 
AAAGTAGTTQ 
GTTTTTCTCT 
TTCTTGRRAC 
GATQG6QTTT 



CAASTGTGGG 
A AGOTGTC AA 
ATTTTTTBAA 
TCAATGTATA 
CCTGAATGCX: 
AGATAGTATt 
CACGGTGTTG 



TQTTAAGTCC 
CTGTAATOCT 
GACCAGCCCG 



2040 
2100 
ai£0 

2280 
2340 
2400 



2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300
MPCEKAMBLI RELHRAPEGQ LPAFNEDQLR QVLEEMKAI.Y EQNQSDVNEA KSGGRSDI.IP
TIKFRHCSLIi RNRRCTVAYL YDRLLRIRAL RHBYGSVLPN ALRFHMAAEE KENFNMYKR8 
LATVMRSLGG DEGLDITQDM KPPKSLYIBV RCLKDYGEPE VDDGTSVLIiK KNSQHPLPRW 
KCEQLIRQGV LEUILS 

Seq ID NO: 155 DNA sequence 
Nucleic Acid Accession #: r
Coding sequence: 149-709



I I i I 

GCGGAGGCCO AGGOGAGAQC CTGGCGCTGT AGGACTAQAA 
6AGCCCAGAT ACCATTTTGG GQTGAGAGCT GGTGOTTGGC 
CGTCCGCCAT GTTCTGC6AA AAAGCCATGG AACTQATCCG 
AAGGGCAACT GCCTGCCTTC AACGAGGATG GACTCRGACA 
CTTTCTAT6A ACAAAACCAG TCIGATGiaA ATQAAGCAAA 
TCATACCAAC TATCAAATTT GGACACTGTT CTCTGTTAAG 
CATACCTGTA TQACCGCTTQ CTTCGQATCA GAGCACTCAG 



TGCTACTZAT ATGAGGTC3U: TGGQAGGAGA 540 



GTTGATGATG 720 

AAATGTGAGC 780 

GAGGCACTTC 840 

CCCTTCACCT 900 

ciroaciAAa 960 

1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 



AACrOCTGAC 
CCACCX3CGGC 
TCCACCTTCA 
AGGACTGCCC 
TGTGGGTTTA 
AATACCATAT 
AATGTQTCAT 



2160 
2280 




G6CCTATGGC AAAACTCCOT CTCTACAAAA AATJUSAAAAA ATTAGCCMSQ 



TC3TGGTGGTG CATGCCTGTA. QTCACAOTTA CACGGCAGGC TGAGGTGGGA. GGATCACTTG 2880

AACCCCAGAG GTCAAC3ACTG CAGTGAGCTG AGATCACACC ACTGTACTCC AGCCTGGGTG 2940 

ACAAAGTGAG ACTCTATCTC AAAAAGAAAT TAGGATCAAT TTGTCRATTT CTACAACMC 3000 

AACAAC3VAAA ACCCXTTCTTG GGCACCTTGA TTOAGATTGC ATTGAATTTA TATAAARCTG 3060 

TTGGGAGAAT TGA CATC TTA ATAATATTGA GTCTTCTGGC CTATAA ACAA GGTCTQTCTT 3120 

CCTAGGTATT AATGTTTTGT CTTCTATTTC TCTTAATAAT CTTTTGTAGT TTTCAOTGTA 3180 

CAGGTCTACC ATCTCAGCAT TTCATAGTTT TGATGCTAAA TGGTAITTTA AAATTTCAAA 3240 

TTCTAACCAC TTGTTGCTAO TAAATAGAAA TACAAITGAT GTTGAACTTO TA TCCTTCAQ 3300 

CCTTGCTAAA CTGTGAGTTC TCATGGTGTT TTTGTAAATT ACATCAACAG TCRTGTGTTC 3360 
TATOAATAAA GAGTTTTACT CTTTC 



b RNRRCTVAYIi YDRLUIIRAL RHEYOSVLPN ALRFBMAAEB MBHFNNyXRS 
LATYMRSLGG DEGIiDITQOM KPPKSIjYIEA GCSGAISAQP ATSTSQVHUI CNItHLPOPVS 
KRLWRI 

Seq ID NO: 157 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 148-621 

1 11 21 31 41 51 

I I I I I I 

TTCGGCGCCA AAOCQCGGAO CGOAGGCCGA OGCGAGAGCC TGGCGCTGTA GGACTAGAAC • 
GAAAGGAGTG AGGCX3CCGAQ AGCCCAGATA CCATTTTGGC GTGAGAGCTG GTGGTTGGCA 
AGGCCGCGGG AGTGGGAAGC GTCCGCCATG TTCTGCGAAA AAGCCATGGA ACTGATCCGC 
GAGCTGCATC GCGCGCCCGA AGGGCAACTG CCTGCCTTCA ACGAGGATGG ACTCAGACAA 
GTTCTGGAGG AGATGAAAGC TTTGTATGAA CAAAACCAGT CTGATGTGAA TGAAGCAAAG 
TCAGGTGQAC GAAGTGATTT GATACCAACT ATCAAATTTC GACACTCTTC TCTGTTAAOA 
AATCGACGCT C3CKCIGXAGC ATACCICTAT OACCGCmC rrCQGKICAa AQCACrGASA 
TGQGAATAT6 GTA6CGTCTT GCCAAATOCA TTAOSATTTC ACATGGCTGC TQAAOAAOTC 
OOaTGTCTAA AAQACTATGO AOAATTTGAA GTTGATGATG GCACTTCAOT CCTATTAAAA 
AAAAATAGCC AGCACTTTTT ACCTCGATGG AAATGTOAGC AGCTGATCAQ ACAAQOAOTC 
CTGGAGCACA TCCTGTCATG ACCAaTGOGCC GAGGCACTTC CRGGCTTCAC TCAACTCATG 
GACTCCTCTG TACTCACTCT CTCCACCACT CCCTTCACCT CCCTCTTTGA TTTTAGAAGC 
TATAGACATT GTTTAAGATA ACTAAOAATA CTTGGCTAAG AAGTATAATT TGCTAACTAT 
TAAGGACTTT CTTTTTTTAA TGTTGTACAC TATTCTTCCT ACTCTTTTTT GGTTTTGQTT 
TTQTTTTOTA GAGACTGTCT CACTATQTTQ CCCAAaCTGQ TCTCAAACTC CTGGCCTCAA 
GCAGTCCTCC CAiCCTTAGCT TCTCAAAGIG TTGAOATCAC AGG08TQAGC CACTGCACCC 
GGOCCCTACT CCTTTTTCTA ATAAGCTGTA TCTGTAATCA CAGCATTCCT ACAGTTCKrTA 
CAGTQTGTTT TTTAAATGAA AGTAAACATO 6TTACATTTO AATCTCTTAA ATAA6CAOTC 1080 
ACTTGGCTGQ ACaGGAAGAA GGTAGATCCT GTGTOTCTTO TTTTCTGeTC ATGTGTATTQ 1140 
TACAAGCTAO AOAGCTGAAT TTCTGAGATA CACATTTTCA AATCACATGC AAGTGAAGAT 
GATGGTCTGT AGAAATTTTC AQTATATATA ATGTTTAATG ACATACTAAT TTATCATCTG 
GCTATTTGGG AAGGAAGGAC ACACATGOAT TTTGCACATT TCCACCATGG TGGCTGGTGT 
GGCTTGTGGC TATGGGQTGA TCACCAGTAT CACCACTTTG GAAGGGGACA GTGAAATTGG 
GGCTAGAGAA GGAACTTTGT ACAGTTTTCC CTGAGATTCA GATTGACTGA AAAGTCACAT 
QAAGAGTTGA TTGTCTTTTA ATGGTATGTT TTAAACAGCT GACATTTTAA ATTTTGATGA 
AATCXaCTTT ATTCGTTTGT TCTTTTATGC TTTGGGTGTT GCATCCGAGA AATCTTrrCC 
CATCXXAAGA TCACAATTTT TTTTCCTTTT TACTTCTAGA AGTGTTATAA TTTTAAGCTT 
TATACTTTGG TCTATGACCC GTTTTTTTTT TTGTTTTGTT TTGTTTTTTC GTTTGTTTCT 
TTCTTTTGAG ATGGAGTCTT GTTCTGTCAC CCAGGCTGGS GTQCM3TGGC aTGATCTTGG 
CTCACTGCAA TCTCTATCCC CTGGGTrCAA QTGRTTCTCT TGTCICaUSCC TCOCAASTAfi 
CTGGGATTAC AGGCACAGGC CGCCACGCCT GOCTAATTTT TGTATTTTTA GTAQAGACAG 
AGTTTTACCA TGTTGGCCAG GCTGGTTTCA AACTCCTGAC CTCAAOIQAC CCKCCTTGGC 
CTCCCAAAGT TTTGGGATTA CAAGTSTGGG CCACOGCGQC CAGCCTATGA TCCATTTTOA 
ATGAATTTTT TATATGGTGC AAGGTGTCAA TCCACXTTTCA CTTTTTCTTa GOAATATAOA 2040 
TATCCAGCTG TTTCACrACC ATTTTTTGAA AGGAC TGCC C TTXGCICTAT CACCTTTGCA 2100 
TTTTTGTTAA AAAGTAGTTO TCAATGTATA TQTGG6TTTA TTTCAQGACT CTGTTTTQTT 2160 
CCATTOACCT GTTTTTCTCT CCTQAATGOC AATACCATAT xwi-ATOT AG TGTATQTAAT 2220 
TTTCTAATAA TTC7TGAAAC AGATA6TATS AATST6TCAT ATTTTTQCTQ TTGTTTGTAT 2280 
TTTTTGTAOA QATGGCGl'i'r CAOOG>TGTT6 6CCAGGCT6T GITeAACTCC TQAQCTAAAO 2340 
CAATACACTT GCCTOGTCCT CCCCATGTGC TOOaATTACA GGOGTQftOCC TT O GIOCTGO 2400 
CCX3\GIGTAC CACATTTCTT TTTOAGATTT GTTTTGGCTA TGTTAAOTCC TTTOCTTTTG 2460 
ATGTGAAATT TGGGAACAGG CAGGOTOrGG TGGCTTATGC CTGTAATCCT AGAACTTTGG 2520
GAQGCCTAGA TGGGTGGATC ACTTGAGCTC AGGACTTCCA GACCAGCCCG GGCCTATGQC 2580 
AAAACTCCGT CTCXACAAAA AATAQAAAAA ATTAGCCAGG TGTGGTGGTG CATGCCTGTA 2640 
GTCACAGTTA CACGGCAGGC TGAGGTGQOA GGATCACrro AACCCOVGAG GTCAAOACTG 2700 
CAGTGAGCTG AGATCAO^CC ACTCTACTCC AGCCTQGGTG ACAAAGTGAG ACTCTATCTC 2760 
AAAAAGAAAT TAGGATCAAT TTGTCAATTT CTACAACAAC AACAACAAAA ACOCCTGTTG 2820 
OGCACCnrSA TTOAGATTGC ATTGAATTTA TATAAAACTQ TrGGOAOAAT TGACATCTTA 2880 
ATAATATTGA GICTTCTGGC CTATAAACAA U UT CItJ' l 'C- ff OCXAGGTATT AATOTTTTOT 2940 
CTTCTATTTC TCTTAATAAT CTTTTGTAGT TTTCM3TGTA CAGGTCTACC ATGTCAQCAT 3000 
TTCATAGTTT TGATGCTAAA TGGTATTTTA AAATTTCAAA TTCTAACCAC TTGTTGCTAO 3060 
TAAATAGAAA TACAATTGAT GTT6AACTTG TATCCITCAG CCTTGCTAAA CTGIGAOTTC 3120 
TCATGGTGTT TTTGTAAATT ACATCAACAG TCATGTGTTC TATGAATAAA GAGTTTTACT 3180 
CCTtC 
MFCEKAMELI RELHRAPEGQ LPAFNEDGLR QVLEEMKALY EQNQSDVNEA KSGGRSDLIP
TIKFRBCSU. RtnUtCTVAYL YDRUAIRAL RWEY6SVLPN ALRFHMAAEB VBCLKDYGEF 
EVDDGTSVIiL XKNSQHFtPR RKCBQLIRQG VI.BHILS 
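
The coding-sequence range given with a DNA record can be checked against the protein record that follows it by translating the indicated span. The following is a minimal sketch of that check, not part of the original listing. It assumes 1-based inclusive coordinates and the standard genetic code; the function name `translate` is hypothetical, and the test fragment is positions 148-180 of the DNA record above (coding sequence 148-621).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding sequence into one-letter amino acid codes.

    Whitespace between 10-base blocks is ignored. Translation stops at the
    first stop codon. Codons containing anything other than A, C, G, T
    (for example OCR damage) are reported as 'X'.
    """
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Positions 148-180 of the record above, written in 10-base blocks.
fragment = "ATGTTCTGCG AAAAAGCCAT GGAACTGATC CGC"
print(translate(fragment))  # MFCEKAMELIR
```

The result agrees with the first eleven residues of the protein record above. Reporting unreadable codons as `X` rather than raising an error is a deliberate choice here, since damaged characters in a scanned listing should be flagged for review, not silently guessed.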

Seq ID NO: 159 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 149-229

1          11         21         31         41         51

|          |          |          |          |          |

GTTCGGGGCX: AAAGCGCGOA GCaOAGGCCG AG6CGA6AGC CTGGOQCIGT AOQACTAGAA 
CGAAAG6AGT GAGGCGCCGA GAGCCCAGAT ACXATTTTGG CGTGAGAGCT GGTGGTTGGC 
AACiGC0GCXX3 GAGTGGGAAG CGTCOGCCAT GrrCTGCGAA AAAGCCATGQ AACTGATCCG 
CGAGCTGCAT CGCGCGCCCG AAGGGCAACT GCCTGCCTTC AACAATTAGC TGGGTGTGGT 
GGCACACACC TGTAGTCCCA GCAACTTAGG AGGCTGAAGT GAGflfiGATTG CATGGCTCCA 
GGAAGTTGAA ACTGCAGTQA ACTGTGGTCA CGCTATTACA CTCCAGCCTG GGTGACAGAC 
TGAATCCCTG TCTCAAAAAG GAAAAGGAGG ATGGACTCAG AGAAGTTCIG GAQGAQATGA 
AAGCTTTGTA TGAACAAAAC CAGTCTGATG TGTTCTCTGT TAAGAAAXCG P 
QTAGCATACC TGTATGACO} CTTOCTTCGG ATCAGAGCAC TCAOATGG 






I I I I 

ATGTTCTGCG AAAAAGCCAT GGAACTGATC
CTGCCTGCCT TCAACAATTA G 



Seq ID NO: 161 DNA sequence
Nucleic Acid Accession #: 1110694
Coding sequence: 1333-2280



11 



AGGAACCTAA 
TTTGCCCTGC 

cxrrrrATCCT 

CACGTCAGCA 
CCCACTCACC 
CCGCTGGGTG 
AGGTCAACAG 
CTCAOCCCRG 
AACCCTGGGC 
ATGTGAGCTC 
CAGGAGAAAG 
AGAACTCAAG 
CAGTCIGCAG 
CTTGGTCTGA 
TGAAGGTGAA 



GQATCTCAGG GAGGTGAGGA CTTTGTTCTC 
CCTGTGTTCG ACAGACACAG TGGTCCCAGG 
GGGAGGATCG AGGGTACCTC CAGGCCAGAG 
CCCTACTGTC ACCCCAGAGA GCCCGGGCAG 
GGGATCACTG GTGTCGGGGA GGGCTGGCCT 
GAQGGAGGGT CCCA6GCCCT OCCAQQAGTC 
AAACACAGAG 6ACCTAGCCC CACCCTGCCC 
GATGGACTCC CCTCACTTCC TCTTCAGGWJ 



GGCTGTCTGC TGAGGTCCCT 
TGGTCTGAGG GGGCTGCACT 
CAGGrGCAGA CTGAGGGGAC 



TCTCCTGQAG ATAGGQCCTC 



GACACATGGA 
AG6TGTGGGC 
CTQATCTGAG 
QTCAGGGCCC 
AGTGTCCAGC 
CCTAAGGGCC 



CCCCATTQAA 
AGATGTTQGT 
AGACTCTCAG 



GCCTTTGTTA 

ACCAGAGTCA 
GAAGCCCAAG 
GAGACTACCT 



GTQTTCACCC 
aCAOCTOGCC 
CACGCTGAGT 
CAGQAGCCCC 
GAACCTCCAA 
TGTGQGTCTC 
TCATGTCTCT 
GAGAGGACTT 
CCTCCTCTGA 
GTCCTCAGGQ 



CCX3CCCTCTT 
CCTCGATTCC 
TCAGGTOGCA 
TGAAT6IGCA 
CCATTCCCCC 
A6CCCTCTCA 
AAGAGGCCCC 
GGTTCGGTTC 
CATCGCCCaVG 
CQAGCAGAGG 
GGGCXTTGATG 
CAGCAAGGAG 



CAGAGGGGAC 
GACAGCACTG 
TCTTCCaGGA 



CCTTCT6TTC CATATCAQGG 
GAGTAGAGTC CAGTCCCTGC 
CATCCACCCC AAAAGTGTGT 
AGGGACCGGG GCTCTGCCTG 
GCTCCAGGAA GCAGGCAGGC 
GACCCAGQCA GTGTCAGCAG 
CACCTGOCCC AGCACACATG 
CATAGAGCCT TGATCTCTGC 



AGTCCGCACT 



TGACGAAGAC GTGTAAGTCA 
TCTCTCACAC ACTCCCTCTC 
OGCTCCTGAC TGCTGCCCTG 
GCAAGCCTGA TGAAGACXTTT 



CTTGTCACTG 
AAGGCCGCCC 
OAAGAGGTTA 
TTCTAGGGGG 



TQQAGTTCAT 
TCCACAAATA 
AAAATTACAA 
TCTTTGGCAC 
CTCTTGGCCT 
TCXTTGATCAT 
TCTGGGAAGC 



GCGCTACTTT 
TGATGTGAAG 
CTCQTQCGAT 
TQTCX^TGGGT 



OCCATCTGCT ACCCATCCCT 
GCACCAGCCO 
GCCCCATGTG 
CAQTGGCAGT 
GIQATTTQGA 
AGTTTAATGA 
ATGTTATITA 



GGGTGQAAGT 
GATTTATCCT 
ACTTCACCAT 
GGAQTAAOAT 



TTAT GAAGAO 
CAAAGTTT6T 
CATTCTTCGC 
6AGCACACTG 
TGCTCCCTTT 
CQAAOTTAAT 

TCTTGcrrrr 



gaoccsstca caaaggcaga aatgctggag 
octgtcatct tcggcaaagc ctcogagttc 
gaggtggacc ccgccx3gcca ctcctacatc 
agcatgctgg gtgatggtca tagcatgccc 
gtgatcx:taa ccaaagacaa ctgcgcccxt 
atgggggt6t atgttgggaa gqagcacatg 
caagattgoa tccaosaaaa cracctqgao 
cactaooagt tccrgtsggg ttccaag6cc 
aattatttgo tcatqctcaa tocaaqaqao 



CAQCAAAATA GAGCTCATAA AGAAATAGia 
ACCTCTTTCT CTCTCCTQTA AAATTAAAAC 
TCTTTGAGCA TGTAAGAGAA ATAAAAATTQ 
TTTTTTCTTC AGACACGCAC TGAACATCTG 



06GGTCAGGG CCOCATCCA6 CAGCTGCCCT 
TCTGTQTTTG AAGAQA6CAA TCAGTQTTCT 
TATGTCATCT CTGGGTTCCT TGrCTATrGG 
TGGAATTGTT CAAATGTTCT TTTAATGGTC 
GAATGACAGT AGTCACACAT ATTGCT6TTT 
GAQTCACATG GGGAAATOCC TGTTATTTTG 
ATTAATAATT TTTTTGAAAC TTqAACTTAQ 
AAATQAAAAT OTAGITAATT CTTGCCITAT 
ATATACATGT ATACCTGGAT TT6CTTGSCT 
AAAOAATAAT TTTTCCTOTT 
TTATTOGQAA CACCCTQGGT T 



1020 

10 ao 

1140 
1200 
1260 




1980 
2040 
2100 
2160 



2460 
2520 
258Q 
2640 
2700 
2760 
2820 
2860 



1          11         21         31         41         51

|          |          |          |          |          |

MSLEQRSPHC KPDEOLEAQG EDLGLMGAQG PTGEEEETTS SSDSKEEEVS AAGSSSPPQS 60 

PQGGASSSIS VYYTI.WSQFO EGSSSQEEEE PSSSVDPAQL EFMFQEAIjKL KVAELVHFW- 120 

HKYRVKEPVT KAEMIiESVIK NYXRYFPVIF 6KASEFKQVI FCTDVKEVDP AGHSYILVTA 180 

LGLSCDSMLG DGHSMPKAAIi LIIVLGVILT KDNCAPEBVI HEALSVMGnTY VGKEHMFYGB 240 

PRKLLTQDHV QENYLEYROV FGSDPAKXEP LHOSXAHAET SYEXVINYLV MLNAREPICT 300 
PSI.YEBVL6B EQEGV 

Seq ID NO: 163 DNA sequence
Nucleic Acid Accession #: AF292100
Coding sequence: 30-809

1          11         21         31         41         51

|          |          |          |          |          |

GGGGGGGGAG RGC3CCTGC5AG GACACCAACA TGAACSiAGTT OAAKTCATOG CRGAABOATR 60 

AAGTTCGTCA GTTTATGATC TTCACACRAT CTAGTGAAAA AACAGCAGTA AGTTGTCTTT 120 

CTCAAAATGA CTGGAAGTTA GATGTT6CAA CAGATAATTT TTTCCAAAAT CCTGAACTTT 180 

ATATACX3AGA GAGTGTAAAA GGATCATTGG ACAGGAAGAA GTTAQAACRa CTGTACAATA 240 

GATACAAAGA CCCTCAAGAT GAGAATAAAA TTGGAATAGA TGGCATACAG CAGTTCTGTG 300 

ATGACCTGGC ACTCGATCCA GCCAGCATTA GTGTGTTGAT TATTGCGTGG AAGTTCAGAG 360 

CAGCAACACA GTGCGAGTTC TCCAAACAGG AGTTCATGGA TGGCATGACA OAATTAQQAT 420 

OTtSVCAGCAT AGAACAACTA AAGGCCCAGA TACCCAAGAT GGAACAAGAA TTGAAAOAAC 480 

CAGGACGATT TAAGGATTTT TACCAGTTTA CTTTTAATTT TGCAAAGAAT CCAGGACAAA 540

AAGQATTAGA TCTAGAAATG GCCATTGCCT ACTGGAACTT AGTGCTTAAT GOAAGArTTA 600 

AATTCTTAGA CTTATGGAAT AAATTTTTGT TGOAACXTCA TAAAOQATCA ATACCRAAAQ 660 

ACACTTGGAA TCTTCTTTTA GACTTCAGTA COATGATTQC ASATOACATG TCTAATTATG 720 

ATGAAGAAGG AGCATGGCCT GTTCTTATTG ATQACTTTGT GGAATTTGCA C!GCCCTCAAA 780 

TTGCTGGGAC AAAAAGTACA ACAGTGTAGC ACTAAAGGAA CCTTTTAGAA TGTACATAGT 840

CTGTACAATA AATACAACAG AAAATTGCAC AGTCAATTTC TGCTGGCTGG ACTGAACTQA 900 

AGATCAATCC TCACAATTCA GACTGAGGGT TOAOACAAAA CTTTAAGQAT ACATCTTGGA 960 

CCATATC8TA TTTCATTCTT CTAATGOIGG TTTGQGCTTG TCTTCTAGTC TGGGCOGCTC 1020 

TAAACATTTA TAATTCCAAC ATTaTGGRTT TCATCTTATA TCIGIGQACC ATOCTAGTTT 1080

ATTCTCCCRT AAGTCrrAOA AGCTTTATGa TGATTATTTT OAGGTrrCCA TTCTOOCATA 1140 

AAGCACAATG CTSTCTTCAT CAQAAAACAQ TTGGCATAAa AATTAAACAT ATGAACATCA 1200 

CAAAACAATT TATAAAAACT TCTTAAATAT AOGCTTTGGQ CTAGTTGCAA AGACTATGCT 1260 

AATAQCACTT CCAOTGAGAG TGATATATTT AAGTGTACTG GATCTGGAAT GGTGTTTTGG 1320 

TTTGGGGGGA ATTTTTTTTT TTTCCTGGCA AATCACATAT GTTGTTGATG TGAGTATCTG 1380 

ATGAAAAAAC AATGTCAGAA TAACCGACAT GAAAATTTTT TAGGATAACT TGGTGCCTAC 1440 

CTGAAAAATG TATTOTGTTT TAGACTCTTQ ATTTCAAAAG OTTCCACAGA ACTAGTCTGC 1500 

GCTTACCTTA CCC&TGTTTA TATATAGCTG TCCTACAGGG AGCTTTTATT TAOARAATOT 1560 

CTGCATAATG TTAGATTCTT CTCCTGTCTA CATTRTQCRC TACATAATTQ OACTTCATTR 1620 

TGCTTTTGAA ATOCTTATCT GCCIGTCACA TAAGTTAAAC TATTT AATTT 6TTTTGAAT6 1680 

TTTTGGATTO CTACACAATA CAATATTCTA AATTTAGGCA TGAGGGTTTT TTTQTTTTAT 1740 

TTTTACTTTT TTTTTGTCAT TGCACTATGQ AACACAAATG AAATTCTCTT AATTTATAAO 1800 

AAGATAGTAG QAQTTAAATT TTGAAAATGQ TTGTGATGAG CCACGAAATT CAATCTTTAT 1860

AATATAGGTA CTGCTCTTTC AGACAAACAQ TCCATrTTTA ATGACTTCTT ATTTTGTTOA 1920 

AATTACTTTA ACTQCTAATC ACTGTGGTTG CCAAATATTT ACTTCAGAAG CAAAGATTTT 1980 

CAAACAAGCA TACAOGATGC AAAATACCAO TCTGGCTTCT AGTCTATTTA CTGTTTTGTT 2040 

TCACTCRGAT TAGCTCAGTT TICTCAICAA AGCAGAATGC TATCTTGOGT GTGTGTGTGT 2100 

GTOTOTSTOT aTOTCrrSTQT aTATGTeTGT ATATATATAT ATATATATAT AXATATATTT 2160 

Tl ' mTf TTT TTTTTTTTAA ATTACAAAAa CCATSAGCTG CT TTTA TGCT GAAAATGGTC 2220 

ATTTCCCTGT TCACTTACTO ACATGTGAAO AAGGSITTCT TGCTTTCTTA AACATTTCCG 2280 

TAAGGCAGGC TAGAAATGTA ATACTTCAAA TGTTTGATGA TTATGGTCTT TTGATAGGAA 2340 

TAGATTCTGC TTGGGATATA TATCCAX3GCA CTCTCTAAGQ TCTAGGGTTG ATATTAACAA 2400 

AGGAATGTAC TTAGAATAGC AGTACATTTT ATGCAAATAT GGAAATTATT TTAAGAAACA 2460 

ATGACATATC AAAACTGCTT TTTACATGAT TTTGAAATAG ACTAGAAAGC TTTCCCTATA 2520 

GACATATTAA TATTCCAATC ATAACTTTAA TTCAAGAATG CAGTTTTACC AAAAGAAAAA 2580 

TTTOAAAATT TCTATTCAGG CTACTGGAAT TGGTTATTAA AAGAAAAAGG AAAAAGAAGA 2640 

ATCTTQCIGC TTTCAGTATT TCCTGATTTT TTTGTAAATA TAAAGAGGAA CTTCAATTAT 2700 

OAAAAATTTT TAAAAGATAT ATATATCTAT ATATCTATAT ATATGTACTO TTTTGTTTCC 2760 

TGTCTTGAAG ATTTTGAGTT ATOGTTATTa GTTTCAGATT GATTAATTCA CATATGCTGT 2820 

GTTTTCrrTA AAAGTCATAT GGGTTCGTGG CCTAATGCCT TGGATTTTAC ATATTTTTCT 2880 

TTTTAAATGC AAAACCTTTT CAACAAAATA GTGTTTGTCA TCAGGTTGGT ACTAAACATT 2940 

TATAATTACT GTGTAATTAT AAACAAAAAT ACATAAAGCT TTGAATATAA TTATGTAGCA 3000 

TAAAAGTTAA GGTTGTTCAC TATGATGGCA TCTTSGAATT AAACAAAACT TTTACTAGGQ 3060 

CTGAAAAGAG AAGACTGATT TAATGTGGTG TGATTATTCT OAAGATAAAT GTCTGGCTAC 3120 

ASGQAATATT TTGTACTAAA AAATGATTAC ACATATGGCT GTGTGTGTTT QAGTCTGTGT 3180 

CTGTGAQAGA GCCAGAGAGA GTGAGABAGA TTGACAGAGA AAGGGAGAGA CACACACACG 3240 

CCOCTTQAAT TGCTTTAACT CCTAAGTGTT TCAGTCCTCA TTCCGGTAAA CTCCCCATGC 3300 

TGATTCTTTO TTTTAAACTG AAOCATAGQT ACAGTTTCCT TTTTOCCAAA TGTCAAAACA 3360 

GGTACAAATT TTAAAATGTA ATGCTTTTTA AATAGAAAAA TGTATAAAAT TAGAAGTGCC 3420 

CACATATAAA AAATACTTOA GATGAAGATT ATCTTTAaXQ AATATCATCT GCATATCTCT 3480 

GTAAOTTCAA TTGTGTTTCT TACAGTCCCT GTCATATTAC CAACAOAGQC AATAAAAOCT 3540 
GCAGTGAAAT TG 

Seq ID NO: 164 Protein sequence:
Protein Accession #: AAG00606

1          11         21         31         41         51

|          |          |          |          |          |

MNKUCSSQKD KVRQFMZFTQ SSBKXAVSCL SQNDWKLDVA TDNFFQNFBL YIRESVKGSL 60 

I3KXKI.GQI.YN RYKDPQDEHK XOIOaiQQFC ODLAUtPASI SVLIIAHKFR AAXQCEFSXQ 120 

EFHIXaCFEI/: CDSIEQLKAQ IPKMEQELKE FGRPKDFYQF TSNFAKNPGQ KBLOI£HAI& 180

YNMLVUIGRF XFLDLMNKFI. LEBHKRSIPK DTWMUJJIFS mUDDMaHY DEEGAHPVI.I 240 
DDFVEERRPQ lAQTKSTTV 
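
A quick consistency check on a DNA/protein record pair is to compare the length of the coding-sequence span with the length of the protein. The sketch below is illustrative and not part of the original listing. It assumes 1-based inclusive coordinates and that the span includes the terminating stop codon, which is an observation from the coordinates in these records, not something the listing states. `expected_protein_length` is a hypothetical helper name.

```python
def expected_protein_length(start: int, end: int, includes_stop: bool = True) -> int:
    """Return the number of residues encoded by a 1-based inclusive span.

    Raises ValueError if the span is not a whole number of codons, which
    usually indicates a misread coordinate.
    """
    span = end - start + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError(f"span {start}-{end} is not a whole number of codons")
    codons = span // 3
    return codons - 1 if includes_stop else codons


# Coding sequence 30-809 from the DNA record preceding this protein.
print(expected_protein_length(30, 809))  # 259
```

The value 259 agrees with the protein record above, which runs through position 240 and then lists 19 further residues. A mismatch between the two lengths is a useful flag that a coordinate or a sequence line was damaged in scanning.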
Seq ID NO: 165 DNA sequence
Nucleic Acid Accession #: AF25621S
Coding sequence: 220-2028

1          11         21         31         41         51

|          |          |          |          |          |

CTCCAGTCCG CATGCTCAGT AGCTGCTGCC GGCCGGGCTG CGGGGCGGCG TCCGCTGCGC 60 

GCCTACGGGC TGCGGTGGOG GCXXiCOGCGG CACCCGGCAG GGCCCGCCAQ TCCCCGCTTC 120 

CCTOCTCCAQ AGCCGCOGCC TGGGCOGGGQ CAGGGCGGGC CrGGGGCTO: TCCATGCTGC 180 

CAGCCGCCGG 6CTGCGGA6C CGACCAABT6 GCTCCTGCtSA TCGCXSGCGQA RGRGGRGGCT 240 

GCGGCGGGAG GTAAAGTGTT GAOAGAGOAa AACCAGTGCA TTGCTCCTGT GGTTTCCAGC 300 

CQCGXBAGTC CAGQGACAAG ACCAAC6GCT ATGGGGTCTT TCAGCTCACav CATGACRGAG 360 

TTTCCAOGAA AACXSCAAAGG AAGTGATTCA GACCCATCCC AAOTGOAAOA TGGTGAACAC 420 

CAAGTTAARA TGAAGGCCTT CAGAGAAGCT CATAGCCAAA CTGAAAAGCa GAGGAGAGAT 480 

AAAATGAATA ACCTGATTGA AQAACTGTCT GCAATGATCC CTCAGTOCAA CCCCATGOCG 540 

CGTAAACTGG ACAAACTTAC AGTTTTAAGA ATGGCTGTTC AACACTTOftQ ATCTTTAAAA 600 

GGCTTGACAA ATTCTTATGT GGGAAGTAAT TATAGACCAT CATTTCTTCA GGATAATGAG 660

CTCAGACATT TAATCCTTAA GACTGCAGAA GGCTTCTTAT TTGTGGTTGG ATGTGAAAGA 720 

GQAAAAATTC TCrTOGTTTC TAAGTCAGTC TCCAAAATAC TTAATTATOA TCAGGCTAGT 780 

TTOACraSAC AAAGCTTATT TGACTTCTTA CATCCAAAAG ATGTTGCCAA AGTAAAGGAA 840 

CAACTTTCTT CTTTTGATAT TTCAOCAAQA GAAAAQCTAA TAGATGCCAA AACTGGTTTG 900 

CAAGTTCACA GTAATCTCCA CGCTGGAAGG ACACOTGTGT ATTCTGGCTC AAGACGATCT 960 

TTTTTCTGTC GGATAAAGAG TTGTAAAATC TCTGTCAAAG AAGAGCATGG ATGCTTACCC 1020 

AACTCAAAGA AGAAAGAGCA CAGAAAATTC TATACTATCC ATTGCACTGO TTACTTGAQA 1080 

AGCTGGCCTC CAAATATTGT TGGAATGGAA GAAGAAAGGA ACAGTAAGAA AGACAACAGT 1140 

AATTTTACCT GCCTTGTGGC CATTGGAAGA TTACAGCCAT ATATTGTTCC ACAGAACAGT 1200 

GGAGAGATTA ATGTQAAACC AACTGAATTT ATAACCOGGT TTGCAGTGAA TGGAAAATTT 1260 

GTCTATGTAG ATCAAAGGGC AACAGCGATT TTAGGATATC TGCCTCJkGGA ACTTTTGGGA 1320 

ACTTCTTGTT ATGAATATTT TCATCAAGAT GACCACAATA ATTTGACTGA CRAGCACAAA 1380 

GCAGTTCTAC AGAGTAAflGA GAAAATACTT ACAGATTCCT ACAAATTCAG AGC31AAAGAT 1440 

GGCTCTTTTG TAACTTTAAA AAGCCAATGG TTTAGTTTCA CAAATCCTTG GACAAAAGAA 1500

CTGGAATATA TTGTATCTGT CAACACTTTA GTTTTGGGAC ATAGTGAGCC TGGAGAAGCA 1560

TCATTTTTAC CTTGTAGCTC TCAATCATCA OAAGAATCCT CTASACAOTC CTQTATGakST 1620 

GTACCTGGAA TGTCTACTGa AACAQTACTT GSTGCTBGTA GTATTGGAAC AGATATT6CA 1680 

AATGAAATTC TGGATTTACA GAQGTTACAG TCTTCTTCAT ACCTTGATOA TTCGAQTCCA 1740 

ACAQGTTTAA TGAAAGATAC TCATACTGTA AACTGCAGGA GTATGTCAAA TAAGGAOTTG 1800 

TTTCGACCAA GTCCTTCTGA AATGGGGGAG CTAGAGGCTA CCAGGCAAAA CCAGAGTACT 1860 

GTTGCTGTCC ACAGCCATGA GCCACTCCTC AGTGATGGTG CACa«3TTGGA TTTGGATQCC 1920 

CTATGTGAC3V ATGATGACAC AGCCATGGCT GCATTTATGA ATTACTTAGA AGCAGAGGGG 1980 

GGCCTGGGAO ACCCTGGGGA CTTCAGTGAC ATCCAOTGQA CCCTCTAGCC TTTQATTTTT 2040 

AACTCCAAAA ATQAGAAACA TTTTAAAGCA TTATTTAOOA AAAAACTGTC TCAACTATTC 2100 

TTAAQTAdO TMTQATATT GTrTQTRTCT TTTATTAATG TrCIAOCACT mTKStOKt 2160 

TTGCATCTTC CTGTCACAGG GATGIGSGQA AATAOGTTIT CCTCCCAAGA GAACCAABTT 2220 

TATTATAOAC TCCTTTATTC AGTGAAATGG CTTATAATCC ACTAGTTGCC AT ATTTTTG C 2280 

TAAAATATTT CTAACCAAGA ATACTACTTA CATATTGTTT TGGCTTTGTT TTATTTTTGA 2340 

TGCAGTTTTT VrTAGTrGAG GTAATGTAAT ATATTGATGT TTTCCTTTGT GTCrAAGATT 2400 

QATTTATAAT AGTAGGTTTG TATAATTTGG AACATTTTCC ATGCCTTGCQ AATTTCCTTA 2460 

ATTGAGGATA GGGCTTACAC ACTTTAAGAA AACAGTGAGT ACTTGAACAT TTAAAGGGAC 2520 

AGT6CAATTT ATAGTCATAA TCACATTGAA TACrGTATTT GATCTTTGGA GACTTAGGCA 2580 

AGCACAGAGC TCGGATATTT ATGCTCAGTT GAGCACTTTA ASAIGAATTT TAAGTQAGAT 2640 

6ATTTCTTGC TTAAAACTCA OAAAQTCAAA AGAQITTCAa CTTTCCTTAC AOAAAAGGAA 2700 

GGATCTTGGG CCCTAGATCT TGGGGATTAA CCTCTGCATA TAAGATTTAC TCTTAATAGG 2760 

CCAGAOGTGQ T6CTCAOGCC TGTAATCCCA GTACTTUGGG AGGCTGAOAC OGQCA<3ATCA 2820 

CTTGAGGTCA GGAGTTCAAG ACOUSCCTGG CCAATATGQT GAAACCCCXST TTCTACTAAA 2880 

AATACftAAAA AAATTACCCA GGCACTCACT CTTGAGGTAA CTAACCAACT CCCACGATAA 2940 

TGACAGTCCA TTCATGAGCX3 CAAAGGCCTC ATGACCTAAT GGC31CACRCC TGTAATCCCA 3000 

ACTGCrrGGO AGGCTGAGGC GAGAGGATTG CTTGAACCTG GGAGGCAGAQ GTTGCAGTGA 3060 

QCCGAGATCG CACCACTGCA CTCCAGTCTQ GGCAACAGAG TGAGACTTCA TCTCAAAAAA 3120 

AGTAAAAAAA AAQATTTAAT ATAATCACIQ AAOATCTCTA TTATAGA3AG ATTAGGTm 3180 

TQACATTGQA AACATACTTA GGGATAGATT T6TCCTAAAG GAAAAAAGTA GGOCCGGGCA 3240 

QAXTAAATOT CTraxOTAAA GTCACACATT AAATTCASTC ACACATTAAA TTCATAGAGT 3300 

TTTAAATGTT TAATGTATAT AAAOCAGTTT CTTTATACAC ATTTGGGAAA ACATTGGTCT 3360 

CACAGATTAA ATGATTAACT AACTGACCCA GGAACTAGTT QTAGCTTTCT AAGTAATTAG 3420 

GCAATTACAG TTATTGCCTG TAACCAAAGG TAATAAAACA AAATGACRAG TACATGTTTA 3480 

AAATTATGAG GCAATGAGAA ATAATTTAAA AACCAATTTT CTAGTTATAA TTTAAAATTT 3540 

GGAGAGCATT TTTAACAGTA ATTAATCCAG AGGTGGCTCA AATTGAGTAT AAGAATTAAG 3600 

ATTATTTAAA ATACTQCATG TCTACCTTCT CGGQGATCAT ACTTTATAAC ACTTTCTGCT 3660 

TCAGTA6CTC TTCATAGCTT GCCAAGTATG CTCCCATATT TTCTCTCTCG TGCCTCGCAA 3720 

ATGAAAfiTCA GATAGGCTGG GAACTCRTGG GGCAGCCCTC AGACITCAAT OTGGGCTTCA 3780 

AATCCAGTTT CCRSTTCTAT ATGGTGCTAC ATCTTTCCAS AAAATTTCCC TCAOAGCCCC 3840 

TCGOCAAAAC AAAGCATTAT TTTGACCCTG CATGCTATTT CTTTAGCTGT AGGTGATAGA 3900

TTAGAACTTC TGTCAGACAT QTTAATGACA AACATACCAA CAGACAATAA CCAAAGCAAA 3960 

TGTTTCCTTC AAGTGTGAAA TGTGCAGGGG CTCGTGGGCA AGGATOTATT GGCACACTGT 4020 

CCTCTTGTVAC TGATAGTGTC CCAGCAATGT TGGAGGTTGG CACCATTCCT GGTCCGACAC 4080 

TTOAGQACCT GAGAGACATC AGGTTTAGAA TGAGCCAAAG AAATCCTACA AGATGGGGAG 4140 

AATTGQTGTG CAGCAGCCTA AGTGTTATAG TTAAGTCIAA AGAAGTATGA AAQATCCCCT 4200 

QTGTTCTCTA AAnOAGCAG AGGGGCCTGC CTACCAATAT CACTTrTTAQ GGGACTGAAC 4260 

CATTGCflaar TAGACTTOGC TTOCAABaAQ TCTGCCTAAQ CCAGGGGTGG CAGGGTAGGC 4320 

CATCATAGCT GGATGGCCTC AAAAGCASAT GGGGGCAGAC TTGCCCTCX3T GATGCCAGQA 4380 

TTTQAGAQGC AGAGTTTCTA QAGGGAGACC AGTGCTGCCT CTCACAGTGG CAGTTTTTTC 4440 

TCTTTGCSVAG AGGAGGGGCT GTTCAATTCC ATAGACCAGT GQGCAGATAG CCAGTTGAAT 4500 

ACTCT G TGCA TGGTTTGATC CTTTATTAGT TCGCTCTAAT ATTTTTCTST AGATOCTTTT 4560 

GTCCTGGACT CAAAATCTAA TCCATGCATT GTATGATACC GTAGCTCTCC TAAGGTTTGT 4620 

GTTTCCTTCA AAATGTTTTA GTTTTCTTCA ACTAAATTTG ATTTTTGCTG TTAGAAGTGA 4680 

CATATTTTTA TQGTAXACAC TATGTTCCTT TTTTCTACTG CGAGTCAA3T TTTTGAATTT 4740 

TOSTaAOAAA GAATATATCT ACAAATT6CA OQAAAGTATC ATAAAAACAQ TACTCTAGAQ 4800 
CAGCGCTGTC CAATASAAAT
ACATTAAAGA AGTAAAAASA 
CAAAATATCA TTTGAACATQ 
TTGGTAATAC TAGTCTTCAA 
GTACTAGCCA CATTGCAAGT 



AGCTGGAAGT 
GTTTAAATGG 
CCAGGCACTC 
TGTAGAGTAG 
GTGGGCGGAT 
CTCTACTGAA 
CTGCTCTGGA 
AGATCGCGCC 
AAAAAAAAAA 
GAAtSTAGACC 
AATTATTTAT 
AATTTAACCT 
CTTGQAATCA 
CAAASACAAA 
AAGTSITTAT 



AAACATAGAG 
CTTGCATGAG 
TTCCCACTCC 



ATAATCTGAG 



AATCTGGTAT 
GCTCAGTAGC 
CCTAACACCC 
ATCAAACCTC 
GTCATACAGC 
ACTACATTAC 



CCACATQTAT 
ACTAArrXTA 
AAAATTATTA 
GTATCTTACA 
CACATGTGGC 
RAGTCCTGTG 
CTTTTAAAAA 
TAAATTCAGC 
TGTAGTGaTA 
CCTGTAATCC 



ACTGCaCCCC 



AAAATTAGCC 
GAATGGOSIG 
A6CCTGGGCG 
AAGAAAAGTC 
CCATAAGGAT 
ATCTGTAAGA 
ATAAACTTTA 



AATTTTATTT 
AT G TTTT A AT 
ATGTGATATT 
TTOATAGCAC 
TAGTGGCTAC 
GATTAGAATC 
TGAGQACGCT 
CTCAACAGGG 
ATTCTTAGGG 
CAGCACTTTG 
CCAACATGGT 



ATCTCACTTT 
TGCACTGGAC 
CCAGAATCAG 
GAGGCACAGA 
TCTTCTGATT 
TTAAAAAA&G 



GAAACCCCGT 



TTACAGTGTC 
AAATCTGSGG 
TGTCCAGTTT 
CCCTGGAATT 
TATGATTAGC 



AGGTGTGGTQ GCGGQCGCCT GTG6TCCCAQ 
GCAGAGATGG 
ACTCCATCTC 
TATATTAAGT 
AAATACCATG 
AGTCTCTATC 
TGTGTTATTA 
TTTAGAAATC 
AATAAATGGA 



TCAAATAACA 
AAATATCAGA 
C ATCaC TAAA 
ACTTTTACTA 



AAAAAAAAAA 
GGTTATTATT 
TTTGAAGAAC 
CATGTTACCA 
CAGOATGATA 
TATTGTGAAA 
ATTTAAAAGA 



GATCTCCGTA 
TGTTTAGTAG 
CATCACTGAG 
GGC CGACAT C 
OCACTTTTCT 
GCCTATGAGA GGCATTTATG 
T6TTTTATGA GAAATGCTTT 
GGCTTAGAGA GCTTTCCAGG 
CAOCAOSGAA CaTOCTTTCT 
TATTCTTOAG TTTAGATTTG 
CTCQTCTGTA TATTOOTATT 
ATTTTATAAT TACTCATTTG 
GCTCCCTTAA AA 



CAGATCATGC 
GGAGACTCAT 
TAGAAGCCAT 
TGCTTTCATT 
ATTTTT6TGC 



T6TACSI60CA 
GAACTCACTT 
TCTTTT ATAC 
TTTAAATTTT 
TAflTTTTTTT 



AGAAATAGCX: 
TGCCAAGAQG 
ACTTGCCCAG 
AAAGAAAAGC 
AGCAGATCTT 
CTACAATAAG 
TCTA6GAAQA 
AXAAAAACI6 



AATATTTAGT 
TATCTCCCCX: 
CACAATGTTA 
TGCTAAG TGG 
TTTTTTCCAA 
TCRGCCTGTC 



TAAGAACTTT 
AAAACCTGCX: 
ACTTCTCATA 
TTCACTTGTG 



4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
S460 
5520 
5580 
5640 
5700 
5760 
5820 



6060 
6120 
6180 

6300 

6360 

6420 

6480 

6540 

6600 ' 

6660 

6730 

6780 
1          11         21         31         41         51

|          |          |          |          |          |

MAAEEEAAAG GKVLREEaJQC lAPWSSRVS PGTRPTAMQS FSSHMTEFPR 
QVEDGBHQVK MKAPREAHSQ TEKRRRDKMM KLIEELSAMI PQCNPMARKIi 
QHLRSLKGLT HSYVGSNYRP SFLQDHELSH LILKTAEGFL FWGCERGKI 
LHVDQASLTG QSLPDFLHPK DWAKVKBQLS SPDISPKEKL IDAKTGLQVH 
YSGSRRSFFC RIKSCKISVK BBHGCLPNSK KKEHRKFYTl HCTGYLRSWP 
' NSKKDNSMFT CLVAIGRLQP YIVPQNSGEI KVKPTEPITR FAVNGKFVYV 
LPQELIiGTSC YEYFHQDDHN NLTDKHKAVIi QSKEKIIjTOS YKFRAKDGSF 
TNPWTKEIiEY IVSVUTLVLG HSBPGEASFL PCSSQSSEES SRQSCMSVPG 
SIGTBIflNEI IiDIjQRLQSSS YUJDSSPTGL MKDTHTVNCR SMSMKELFPP 
TRQNQSTVAV HSHBPIiIiSDG AQLOFDAItCD NDOTAMAAFM MYIiBAEGGLG 



Seq ID NO: 167 DNA sequence
Nucleic Acid Accession #: NM_014400
Coding sequence: 86-1126



KRKGSDSDPS 
DKLTVUJMAV 
LFVSKSVSKI 
SNLHAGRTRV 



DQRATAILGY 
VTLKSQHPSP 
NSTGTVIiGAG 



DPGDFSDIQH 600 



GATCIGGACT GCAQGCieGC TGCTQCTGCT GCTGCTTCGC GGAGGAGCaK: AGGCCCTGGA 180 



GTQCTACAQC 
GAA6T6C6CG 
CGGACAATTC 
OGGCCTGQAT 
CTGCAAOGCC 
ATACCCX3CCC 



TGCXJIGCAGA AAQCAQATGA CXK3ATGCTCC 
COGGGOSTGG AOGTCTQCAC COAGQCXSSrQ 
TCGCTGGC3VG TGCSGGGTTG CGQTTOGGGA 
CTTCAOGGGC TTCTGGaSTT CATCCAGCTO 
AAGCTCAACC TCACXTTCGCG GGOGCTCGAC 
AACGGOSXGG AGTGCTAOVG CTGTGTGGGC 
CCGCCGGTCG TGAGCTGCTA CAACGCCAGC 

AAOSTCSUXrr tgacggcagc taatgioact 



ctccccggca 
cagcaatgcc 
coggcaggta 

CTGAGCCGGG 
GATCAT6TCT 
GTGTCCTTGC 



AGACCATCCA 
AGAATGACa; 
CTCAGGATCG 



ACAAGGGCT6 



TGGCTCCT6T 
CCCTCGAATC 
CACATCTGTC 



TGCX»iGGGaT CCCGCTGTAA 
CCACCCCTTG TCCQGCTGCC 
ACC3«rrTCTA CCTCGGCCCC 
ACCAGTCAGft CTCOGAQACA 
TTGACTGGAG GCGCCGCTGG 



3 CCGTGGCTGC 



AAATTTCCCT 
OCCACCACTG 
CTTCTQCTGC 



TCCTCTTGTQ 
AGGATQCTAA 
GGTGGGACAA 
ATOGGTTCCC 
CTTATGTCTG 



GCTGGTTTGC 
OCTTTTTGAG 
ATGTTAGGAC 
GCTTCCTACT 
TGGCTCCCCR 
CATATGTCTT 
TGTGTQATCA 



GACABCTCCT 
AGAGTGAGAG 
CACTTTCTCX: 
CTCTAAGCAC 
OCTTACTAGA 



CTCTGACCTC 
CCCTCCAGAG 
AGTGAGACCC 
GGGAGTAGAA 
CCACCAGGAC 
TAATAAAGGC 
TGGTGTCCTA 
TQtsGTACCCC 
TGTTTTTCCA 
AATAAAATAC 
GTATCCTTCT 
AAGTCAGCTQ 
TAGCCAGCCT 
TGCCTCCCCT 
CTQTGAGCTC 
CATAAATGCC 



CCCaCGACTG 
ACATCCACCA 
CAOQAGGCCT 
CGCAGCAATT 

CTGTGAGCTT 
TCTTCTCATC 

OGTTGTATAT 
CATCCTTGTC 
TCA CX3GG GAA 
GGACTTTGGA 
ACrcCCCGCA 



TCACGCTCAQ 
CCTACTTCTC 
•IGGCCTCAAC 
CCAAACCCAT 
aXX3GQATGA 
CAGGGCAGTA 
CCACAGCTGG 
CTCCACCTGG 
ACrrCCTGTT 
GTATCCCCA6 
ATTCTGGCAG 
TCTCCX3CTTG 



TCAATAAAGA 1 



GCGTGGGGTG 
TCTTTGGGGA 
GGGACXSTGC 



1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
3 AAAAAAAA



1          11         21         31         41         51

|          |          |          |          |          |

MDPARKAGAQ aMIWTAGWLL LIJiLKGGAQA LBCTfSCVQKA rmGCSPHKMK TVKCAPGVDV 
CTEAVGAVET IHtSQFSLAVX GOSSGDPGKN DBGUJtflGLL APIQLQQCAQ ORCaiAKIja.T 
SRAIiDPRGUB SAVPPHGVEC VSCVGIjSREA CQGTSPPWS CYUASDHVYK GCFDGIIVTI.T 
AANVTVSLPV RGCVQDEFCT RDGVTGPGFT LSGSCtXJGSR CNSDUINKTV FSPRIPPIiVR 
LPPPEPTTVA STTSVTTSTS APVRPTSTTK PMPAPTSQTP RQQVEHEASR DEEPRLTGOA 
AGHQDRSHSQ QYPAXGGPQQ PHNKGCVAPT AGIiAAUiLAV AAGVLL 

Seq ID NO: 169 DNA sequence
Nucleic Acid Accession #: NM_006875
Coding sequence: 186-1190

1          11         21         31         41         51

|          |          |          |          |          |

GAATTCGGCA CGAGOGOSCG GCGAATCTCA ACGCTGCGCC GTCTGCGGGC GCTTCCGGGC 
CACCAGTTTC TCTGCTTTCC ACCCTGGCXK: CCXXICAGCCC TGGCTCCCCA GCTGCGCTGC 
CCCGGGCGTC CACGCCCTGC GGGCTTAGCG GGTTCAGTGG GCTCAATCTG CGCAGCGCCA 
CXrrCCXTOTT GACCAAGCCT CTACAGGGGC CTCCCGCGCC CCCOGGGACC CCCACGCXX3C 
CQCCAGGAGG CAASQATCSQ GAAGOGTTOS AGaCCOASTA TCGj^CGGC CCCCTCCTGG 
GTAAGGGGGG CTTTG6CACC GTCTTCGCAG GACACOSCCT CACMATCGA CTCCAGGTGG 
CCATCftAAGT OATTCCCCGG AATCGTGTGC TGGaCTGGTC CCCCTTGTCA GACTCAGTCA 
CATGCCCACT CGAfiXSTCGCA CTGCTATGGA AASTGGGIIQC AaOTOaTaGQ CACCCTGOCS 
TGATCCGCCT GCTTGACTGG TTTGAGACAC AC3GAAGGCTT CATGCTGGTC CTCXSAGCGGC 
CTTTGCCCGC CCAGGATCTC TTTGACTATA TCACAGAGAA GGGOCCACTG GGTGAAQGCC 
CAAGCCGCTG CTTCTTTGGC CAAGTAGTGG CAGCCATCCA GCACTQCCAT TCCCGTGGRG 
TTGTCCATCG TGACATCAAG GATGAGAACA TCCTGATAGA CCTACGCOGT GGCTGTGCCa 
AACTCATTQA TTTTGGTTCT GGTCCCCTGC TTCATGATGA ACCCTACACT GACTTTGATG 
GOACfiAGGGT GTACAGCCCC CCAGAGTGGA TCTCTCGACA CCAGTACC3VT GCACTCCOSG 
CCACTGTCTG GTCACTGGGC ATCCTCCTCT ATGACATGGT GTGTGGGGAC ATTCCCTTTG 
AOAGGGACCA GGAGATTCTG GAAGCTGASC TCCACTTCCC AGCCCATGTC TCCCCAGACT 
= CCAAACCTTC TTCCCQACCC TCACTGGAAG 

1140 

IV TGOTCftGAAG AGCCaVTCCCA TGOCCATGTC ACAGGGATAG ATGGAC»TTT 1200 

GTTGACTTGO TTTTAC»GGT CATTACCAGT CATTAAAOTC CAGTATTACT AAGGTAAGGG 1260 
ATTGAGGATC AGGOGTTAGA AGACa.TAAAC CAAGTTTGCC CAGTTCCCTT CCCAATCCTA 1320 
CAAAGGAGCC TTCCTCCCAG AACCTGTGGT CCCTGATTTT GGA6QGGGAA CTTCTTGCTT 1380 
CTCATTTTGC TAAGGAAGTT TATTTTGGTG AAGTTGTTCX: CATTTTGAGC CCCGGGACTC 1440 
TTATTTTGAT GATGTGTCAC CCCACATTG6 CACCTCCTAC TACCACCACA CAAACTTAQT 1500 
TCATAXGCTT TTACTTGGGC AAJSGCrrGCTT TCCXTCCRAT ACCCCAGTAG CTTTTATTTT 1560 
AGTAAAGGGA CCCTTTCCCC TRGCCIMGO TCCEATATTG GQTCAAGCTG CTTACCTGCC 1620 
TCAG02CAGG AETTTTTATT TTGGQGGAGG TAATGOOCXG TTGPrTACCXIC AAGGCTTCTT 1680 

■n ' rm " n " i " i ' tttttttttg ggtgagggga ccctactttg ttatcx:caag tgctcttatt 1740 

CTGGTGAGAA GAACCTTAAT TCCATAATTT GGGAAGGAAT GGAAGATGGA CACCACCOSA 1800 

CACCACCAGA CAATAGGATG GGATGGATGG TTTTTrGGGG GATGGGCTAG GGGAAATAAG 1860 

GCrrGCTGTT TGTTTTCCTG GGGCS3CTCCC ICCAATTTTG CAGATTTTTG CAACCTCCTC 1920 

CTGAGCCX3GG ATTGTCCAAT TACTAAAATG TAAATAATCA CGTATTGTGG GGAGGGGAGT 1980 

TCC3U«3TGTG CXXTTCXTTTTT TTTTCCTGCX: TGGATTATTT AAAAAGCCAT OTGTGGAAAC 2040 
CCACTATTTA ATAAAAGTAA TAGAATCAGA AAAAAAAAAA AAAAAAAA 



Seq ID NO: 170 Protein sequence: 
Protein Accession #: NP_006866 

1          11         21         31         41         51

|          |          |          |          |          |

MLTKPLQGPP APPGTPTPPP GGKDREAFEA BYRLGPLUSK G6F6TVFAGH RbTDRLQVAI 60 
KVIPRNRVIiG WSPLSDSVTC PLEVAI.I.WKV GAQOGHPaVI RUiJHIFBiaE OFMLVLERPL 120 
PAQDLFDYIT EKGPLGEGPS RCFFGQWAA K^CHSROW HROIXDEHZIi IDLSSGCAKIi 180
IDFGSGALLH DEPYTOFDOT RVYSPPEWIS RHQYHAI>PAT WTSLGIIiLYD MVCGDIPFER 240 
DQBILEAEI.K FPAHV8POCC AIiIBRCLAPK PSSRESIfBI IJJ3PIIMQTPA EOVTPQPIiQR 300 
RPCPFGLVLA TIiSUUtPGLA PNGQKSHntA MSQG 

Seq ID NO: 171 DNA sequence
Nucleic Acid Accession #: NM_003646
Coding sequence: 89..2875

31 41 51 

I I I 

CCSCCX5GCCC GGCATGGGCG TCTCCCGCGG 60 
GGAGCCQCQG GACGGTAGCC CGGAGGCCC6 120 

QTCCAaaaac tcxxsagcgco Acooooarcc i80 

CAABCGGOGC TTCCOGG GO C TGCGGCTCTT 240 

CCTCCAGCAC CTGGCCCCCC CTCCGCCCAC 300 

GCAGATCXMG AGTACAQTGG ACTGGAGC6A 360 

CGAGACCAAC GTGTCOGGGG ACTTCTGCTA 420 

QCTGAAGTCA GTGTCTCGAA GAAAGTGCX3C 480 

CATCGAGCaO CTGGAGAAGA TAAATTTCOG 540 

CAGGAAT6TC CGCGAGCCAA CCTTTGTACX5 600 

CGGCAAGTGT CGGCACTGTG GQAAGGGATT 660 

GATTGTGGCC ATCAGCTGCT CGTGGTGCIVA 720 



GCGGOGCGGA GCGGGCGTGC TGAGCCCCGG 
GCCXTCC6CC OGCOGGGGCT AOGaCCGOAT 
QAGCAQOSAC TCOSAeTCGG CTTCOSCCTC 
GQAGCCGGAC AAGQOGCOSC QGOQACTCAA 
OGGGCKCAGQ AAAGCCATCA CCAASTCQGG 
CCCTGGGGCC CCGTGCAGCG AGTCAGAGOS 
aT£a«5CGACA TATGGGQAGC ACATCTGGTT 
OSTTGGGGAG CAGTACTGTG TAGCCAGOAT 
AGCCTGCAAG ATTGTGGTGC ACACGCCCTG 
CrOTAAGCCa TCCTTCCQTG AATC3W3GCTC 

CCAGCAGAAG TTCACCTTCX: ACAGCAAGGA 



GCrOGGGCTC 
CCAOAATACT 
CAAGAAAGGG 
GCTCATGAAG 
OATCATCCAG 
AGGGCCCAAG 
CGGGGGOGAC 
GCCaVCCCCCT 
CTGGG6TGQG 



CACGCAGCCG 
CTGAAAGCAA 
CCTGAGGAG6 

caxnacTGG 

TCTTTCCTCT 
GAGGCGCTGG 



GTTGCCATCC 1 
GGCTACACAG A 
CAGCTQGACC G 



TGCTCATCCC 
GCAASAAGAA 
GCCGCTGGA6 
TGTTTGTGAA 
GGTATCTCAA 
AGATGTACCG 
GCTGOATCCT 



GAAOAGGGCA 
ACCCTTCATC 
CCCCAAGAGT 
TCCCCOACAA 
CAAAGTGCAC 



3 CCOSGAGGCC 840 



ATCAGGCCCA 
GGGGGCAACC 
GTCTTCXaCC 
AACCTGCGGA 
GACCAGOTAC 
GACTTGGCCC 
CTCTCCCRCG 



CCCCCTCCCX: 
AGGGTGCAAA 
TGAGCCAGGG 
TCCTGGCGTG 



GAACCCTCAA 



CAGCCTGGGC TTTOACGCCC 
AQAQAAATTC RACAGCCX3CT 
CTTCCTGATG GGCAGCTCCA 
GGACTTGACT CCCAAGATCC 
CAGGTACTGT GCGGGCACCA 
CCftOCGGCAT GAOGACGGCT 



AOGTCACCCT 
TTOQOAATAA 
AGGACCTGGC 



CCCCCTGCRC 



GCOGCTGGGC 
TQAGAQACTC 
GTCXKCCAAG 
AGCCCAGGAG 
CCCTGAGCTG 



CTGOGCAACC 
AGCGACCAGC 
GACTATGAGG 
ACTGTGGTGG 



TGGTGCTTCC 
CACCTCAACT 
CIGGQGGCAT 
TGCTCaCCCA 



TGCCCTGGOG 
ACCTCGAGGT 
aCGAGCGGCT 



CCCTGCACTA 
TCrCAGGAGA 
CCXSATGGTGC 
TGGACGCCAC 
ATGTGACTGA 



GTTGOCCCTG GATGTCTTCA ACAACTACTT 
GGAGTTCCAC GAGTCTOGAG AGGCCAACCC 
GATGTTCTAC GCOGGGACAG CTTTCTCTGA 
CAAGCACATC CGAGTGGTGT GTGATGGAAT 
ACX:CCAGTGT GTTGTTTTCC TGAACATCCC 
CCACCXrTGGG GAGCACCACG ACTTTGAGCC 
CATTGGCTTC ACCATGACGT CGTTGGCCGC 
QACGCAGTGT CQOGAGGIGG TGCTCACCAC 



CAGTGACCTA 
TGGAGCCAAG 
CACTGCCAGC 
GATCGCACAG 



OGCCCCGGTC A 



CAGCTCAAGG 
GAGCTCTGCC 
TCCCCGACAT 
CGCTTCTACA 
GATGAGATTT 
ACCCCCACTT 



AGCTGGGGGC GACCTCATGC 
CAGCACTGGC AGCAAGQATG 
TGATGCGGTO QAGGAAAACG 
CACCATCTGC CACTACATCG 
CGACACTCCC CGGCAGCGGG 



TGQTCOGCTA 



CCTGCTGGAC 
TTTGCACCAA 
GGCCTOGCTC 
TCAGGACACC 



QAACCXSGCAG CACTACCAGA TGATCCAGCG GQAGGACCAG 



AAGCTCCAGG 
ACGCTCCTaC 
CACGCCCCCC 
GCAGCGGCCC 
ATGAAGACM3 
GAGCTGGCCG 
GAOACXSGCTG 



TGAGTCGCGT 
AGGCCTCTGT 
GTGCCCAC3VT 
GCCAGAAACT 
GGATCGACCG 
ATATCCTGGA 
CCOCTCTCCC 
CCCCTCAAGG 



ACCACGCAGT 
CAGAGATCCT 
TGGGCXAGCQ 



1200 
12 CO 
1320 
1380 
1440 
1500 
1560 



1740 
1800 
1860 



2160 - 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

3820 

2880 






MEPROGSPEA 
GLQEIAPPPP 
mXSVSKRKC 
DGKCRHOGKG 
PPTWILRAHR 
NPKSaGNQQA 
LSTU3QUIUK 
IJIAEPKPEAG 
KMPYAGTAPS 



aackiwhtp 
fqqkftphsk 
pqntijKaskk 
k1iqsft.kyi. 

PPPPVAILPIi 



RQIRSTVDWS 
CI^LEKINF 
BIVAISCSWC 



NPRQVFDLSQ 
GTGNDUUITL 
RLPLDVFNNY 
AXKIRWCDG 



GEPCKLAASR 



PQRHDDGYLE 
IRIALRNQAT 
VPLGTVWPG 
3 RAQBHLmVT EIAQDEIYIL DPELLGASAR 



NWGGGYTDEP 
FSLGFDAHVT 
MDLTPKIQDL 
AJbOVG(«GER 
APUISDQQPV 



I 

NKRRFPGLRL 
FETNVSGDFC 
SRNVREPTFV 
FMIiQQIEEPC 
RPFIIRPTPS 
RKVHNLRIIA 
VSKII.SHVEE 



FGHRKAITKS 
YVGEQYCVAH 
RHHWVHRRRQ 
SLGVHAAWI 



CXSGDGTVGWI 
GNWQLDRWD 
PEKFHSRFRN 
PRYCAGTMPW 
TSKAIPVQVD 



2 RTICHYIVEA 



AQDTBIAAyii E 

Seq ID NO: 173 DNA sequence 
Nucleic Acid Accession #: AF232772
Coding sequence: 1-1662 



AGAKSPTCQK I.SPKHCFU1A 
FOLPTPTSPL PTSPCSPTPR 
EQSRTLLHKA VSTGSKDWR 
GASUVCTDQQ GDrrPBQBABK 



21 



31 



51 



I I I I 

ATGCCGGTGC AGCTGACGAC AGCXXTTGCXTT GTGGTGGGCA CCAGCCTGTT TGCCCTGGCA 
GTGCTGGGTG GCaTCCTGGC AGCCTATGTG ACGGGCTACC AGTTCaVTCCA CaCXSGAAAAG 
CRCTACCTGT CCTTCGGCCT GTACX3GCGCC ATCCTGGGCX: TGCACCTGCT CATTCSWSAGC 
CTTTTTQCCT TCCIGGAGCSl CCGGCGCATQ CGACGIGCOG QCCAGGCCCT GAAGCTGCCC 
r GGCACrGTGC ATIGCXiGCAT ACCAGGAGGA CCCTOACTAC 



GGTGAGACGG 
AGCACCTTCT 
TTCAAGGCCC 
GATCCAGCCT 



CAGTSTATTA 



ACCAACXGAG 



TACTTCCGG6 
TACGAGTCaO 
TTCTACCGGG 
A7TATCAAGG 



AGGCCAGCCT 
CKTGCATCAT 
TOGGCGATTC 



GTGGGCCCTT 
ATCAQAAGTT 
TCCTGAGCXrr 
CCACTAA6TA 
AGTGGCTCTA 



GCCGCMCTG 



GCAGAAGTGG 



CAACGTGGAG 
GGOCATQTAC 
CCTAGGCAGC 
TGGCTACCGA 
CCTCtXSQTGa 
CAACTCTCTG 
TTTCTTCCCC 
GAACATTCTC 
CTCCTTCCTT 



TACATGCTGG ACATCTTCCA OGAGQTGCTG ' 420 

TGGCGCAGCA ACTTCCaTGA GGCAGGCGAG 480 

ATGGACCXSTG TGCGGGATGT GGTGOSGGCC S40 

GGAGGCAAGC GCGAGGTCAT GTACACGGCC 600 

ATCCAGGTGT GOGACTCTGA C3VCTGTGCTG 660 

GTCCTGGAGG AGGATCXXX:A AGTAGGGGGA 720 

TACGACTCAT GQATTTCCTT CCTGAGCAGC 780 

OQG6CCT6GC AGICCTACTT 1GGCIGT6TG 840 

CQCAACnGCC TCCTCCAGCA GTTCCTGQAG 900 

AAGTGCAGCT TGGGGGATGA CCGGCACCTC 960 

ACXAAGTATA CCGCX3CX3CTC CAAGTOCCTC 1020 

CTCAACCAGC AAACCCX3CTG GAGCAAGTCT 1080 

TGGTTCCATA AGCACX»CCT CTGGATGACC 1140 

TTCTTCCTCa TTGCCACX3GT TATACAGCTT 1200 

CTCTTCCTGC TGAOGGTGCA GCTGGTGGGC 1260 

0GGGGCAAT6 CAGAGATGAT CTTCATGTCC 1320 




CTCTACTCCC TCCTCTATAT 
ATCAACAAAT CTGOCTGGGG 
CTCATTCCTG TGTCCATCTG 
TGCCAGGACC TGTTCAGTGA 
GGCraCTACT GGGTGGCCCT 
AAGAAGCCGG AGCAGTACAG 
CGGGTAAAGT GCAATGGGTA 
GGGAGGAtSGG AGTGCTGTOT 
AAGAACGGTG ATGTAGTATG 



OTCCAOCCIT CTGCCGGCCA AGATCTTTGC CATTGCTACC 
CTTCATTGGC 
CACAGCTTAT 
TATACTQTAT 
GGGAT6TGGG " 
CCAAGCAGAG 






GGTGGCAGTT 
GACAGAGCTA 
CCTCATGCTA 



AGGGAGGGAA 
TTTAGTCTCI 
GCTTGACAGC 



GCCTTCCTTG .TCTCTGGGGC 
TATCTOGCCA TCATCGCCGG 
GCTGAGGTQT QACATGGCCC 
GGGGAATGGA AGAGAAAAGA 
TAATGGTCCA AAGGACWUVT 
TCTGTTTAGA GGAGGCAACA 
TCAGACTGCC TGTCTGCTTO 
GOCACTCAGA AGTTGTGCTA 



CTAAAATGCA 
CTGATCCCCC 
CATCIGCACA 
AACCAA6TTA 



AGTCCCATTC 
TTCTCCCAGC 
GGCCGGTTAG 
GCGCGTGAGA 
ATGQTGCTTT 
CAGGGAGTTA 
GGAACAAAGA 
TTCCACCTGG 



TAACCTGTAT 



CCCTCATCAT 
TAGAAGCTTC 
TGATCARATT 
ATGTGTGACT 
TCCCAAAGTQ 



TCCASAAACC 
CTCATOCCTC 
QAAGCCATTT 



OCATCTGAAC 
TGTATGTCAC 
TACAAGGCCC 
ATGTGAGATA 
GCACTGAACr 
GATTGTGGTG 
AAACTGCrCA 
AAAOATTAAG 
ACTCTTGAAT 
CTTTCTTCAA 
CATAGGTAAG 
AGCAGGAGGC 
GGCTACAATC 
TTCAGGCTAC 
AACTCTCAAA 
GGGAATTCTT 
AAACTAGQAG 



CCCCUCCCCA 
AGAAGCCTGA 
CCCCACTCC3V 
GCTTTTAAAA 
GTGCTAAAGG 
OACGTCTAOA 



GTGGCAGGAG 
CCCATAAGTA 
TCTTTGGGCA 



TCCTCTCAAA 
AGCCACATTT 
GTTTTCAAGG 
AAGCGTGTTC 
TTGGAGCTGC 



TTCAGCTCTG 



TGGCAATTGG 
TCAGCACATA 
TTGQAaSGAT 
AATCATCTCC 



AATTTCTACT 
GTCATC3UITG 
TCAGAAAACA 
CCAGGGATGA 
AAAAGGAAAG 
CTACACAGAG 
GCTTGTCT6T 
AAOTQAAGIC 
ATCTOAGaCT 
CACTGCAGTC 
GGCGGAGCCC 
TGGGAACTAT 



TTTGCCAGGA 
GCCTTGGGTG 
OATCrCTQCT 



1860 
1920 
1980 
2040 
2100 
2160 

2280 
2340 
2400 



TCAAAGGGGC CAACTAACCC 
AAGCCTCTAA TGTACCAAGT 
GOOCAAACCC 
AACAGTTGCC 
CCAACCCATT 
6TTCTGCIGQ 
CAAOTGCAOA 6TTCAGACTT 
AATGTAaQAT 
GTCAACTTTC 
TCGACTGGTT TTTCTAAGTT 
CAGCTTTATC CCCGTTTCTT 
GAGAATTCAA 
CCTGCTTTTT 
AAAATAAAGA 



TGTGACAGTC 
TTCCTACATC 
TTTCCCTCTG 
C TTGCAATCCA GGCTGTTCTC 
GTTGCTTGCT 
AGCACTAAGG 
AA6A0GCAAG 
CACGAACTCA 
TTCTGGTAAG 
CATTTGCTAA 
6AGCATTAGA 



AATGGAAAGC 
GACCATCCAT 
ACTGCATTTa 
CTAAGOTTCT 



AAGACACACT 2580 
ACCTCTTCTA 
OGGCTTCTTA 
6AGGAGCCTC 
CXX3GGTTAGC 
TTTTCAGTGT 
TCTCCTCMT 
CCTGCTTCTT 



GTGCAGAACC 
GCTTCCTACA 
CTT CATCTCC 
GACCTCCAGT 
AGGCAGCAGG 
AAACTATTTT 
CGCTAAGGGC 
CCTGGCTGTC 
AAACAGGCAO 
ATTTTGTACA 
GCAAGGGAAG 
ATCCTCTTTT 



TTCTCACAT 



CTACCATCAG 
GTACAGGTAG 
TTTTTCAGCA 
AGCCTTTATA 
GTATTGTTTC 
TAATTTTCAG 



AGCSTTTTQA 
TGCCCTCCRA 
TGGACAGCAG 
GTGTGCTCCG 
AGGGTTTTCC 
ACTGCTGGTT 
GGCAGCTGRT 
ATGGAGGAAA 
TCAGCa^TTTA 
CTCTGCCTTQ 



GCAAAACCAA 
CAATTGGAOG 
TACAATAATT 
GTCAAGTTTT 



OTTTAAAACC 
ATGTCCTTTC 
ACAAGAGGGC 
AACC3«3«3AT 
AAGTGTAGCT 
GACATCAGAC 
CCAGGCAATC 
TCCGGTCAGC 
CTTGAAGATT 
CACTGTGGTC 
ACGTTTGACC 
ACTGQQTCTT 
CRTTTTGQTT 



2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 



TTATACI6CA 



3650 
3720 
3780 
3840 
3900 
3960 
4020 



KPVQLTTALR 
LPAFLEHRRM 
WDGNRQBDA 
STFSCIMQKW 
VGGDVQIUnC 
DWYHQKFLGS 
yPREKLVNSL 
IIKATYACFL 
LIFVSIMVAV 



WGTSLFAIA 
RKAGQALKLP 
yMLDIFHEVI. 
GGKREVMYTA 
TOSWISFI.33 



WFHKHHIiWMT 



21 
I 

VLGGIIAAYV 
SPRRGSVALC 
GGTEQAGFFV 
FKALGDSVDY 
VRYWVSRFHVE 
TNRVLSLGYR 
YESWTGPFP 
LYSLLyMSSIi 
OQDLFSBTBL 



TGYQFtHTEK HYLSFGLYGA 
lAAYQEDPDY LRKCLRSAQR 
WRSHFHEAGB GBTEASLQEG 
IQVCBSDTVL DPACTIEMLR 
RACQSYFGCV QCISGPLOIY 
TKYTARSKCL TETPTKYIiHW 
FFLIATVIQIi FYRGRIWNIIi 
I.PAICIFAZAT 
AFIiVSQAII>Y 



OCYHVALUOi 



ILGIflbLIQS 
ISFPOUCWM 
MDRVRDWRA 
VLEEDPQVGG 
RNSLLQQFLE 
LHQQTRHSKS 
LFIJjTVQI.VG 
RKTIWHFIG 
YUVIIARROa 



Seq ID NO:
Nucleic Acid Accession #: NM_000691



70 1 






CCRGGAQCCC 
GC0GT6AAGC 
TTCCAGCAGC 
GCGCTGGCCG 
GTCCTAGAGG 



GTGGTCCTCG 
GGOGCCATGG 
GOQAGCCTGC 



TGGAGGCGCT 
CAGACCTGCA 
AQATCXSAGTA 
CGCCCCAGAC 
TCATTGGCAC 



TGGCTACCKT 



21 
1 

OAGAGQCTOT 
CGCCTTCAGC 
GCAGCGCCTQ 
CAAGAATGAA 
CATGATCCAG 
TCAGCAGGAC 
CTQGAACTAC 
OGCAGTGGTC 
CATCCCCCnO 



OTCAAAGGOG 0CAT6AGCA& GATC3U3CQAG 
TCGGGCAGGA CXXOTOCGCT GCAOTTCCGA 



ATCCAGGAGC 
TGGAACX3CCT 
AAGCTCCCTG 
GAGCTCTACA 
OCCTTCAAGC 
CTCAAGCCCT 
TAOCIGGACA 



AGGAgCAGGA 
ACTATGAGGA 
AGTGGGCCGC 
TCCACTCGGA 
TCACCATCCA 
OGQAGCTGAG 



GTCA06CTGG 
GTGGCCXGCC 
CCAGACTACA 
TCACTGAAAG 



OGGGGGTGGG QAAGATCATC ATGACGGCTG CTGCCAAGCA 
AGCTGGGAGG GAAGAGTCCC TGCTACGTGG ACAAGAACTG 
GACGCATCGC CTGGGGGAAA TTCATGAACA QTGGCCSVGAC 
TCCTCTGTGA CCCCTCGATC CAGAACCAAA TTGTGGAGAA 
AGTTCTACGG GGAAGATGCT AAGAAATCCC GGGACTATGG 
ACTTOCAGAG GGTGATGGGC CTGATTGAGG GCCAQAAGGT 
ATOCOGOCAC TCGCTACATA GCCCCCACCA TCCTC31CGGA 



GGTGGTGTAC 
GGATGAGCCC 
GCCACTGGGC 
GCCCATGGTG 
TGAGAACATG 
CCCAGTAATC 
TATCCTGTAC 
CCTGACCCCT 
TGACCTGGAC 
CTGCGTGGCC 
GCTCAAGAAG 
AAGAATCATT 
GGCXTATOGG 




CAGTCCCCGG TGATGCAAGA GGAGATCTTC GGGCCTGTGC TGCCCATCGT GTGCGTGOGC 1080 

AGCCTGGAGO AOSCCATCCA GTTCATCAAC CAGCGTGAGA AGCCCCTGGC CCTCTACATG 1140 

TTCTCCAGCR ACGACAAGGT GftTTAAGAAG ATGATTGCAG AGACATCCAG TGGTGGGGTG 1200 

GCGGCCAAGG ATGTCATOGT CCAO^TCACX: TTGCACTCTC TGCCCTTCGG GGGCaSTGGGG 1260 

AACAGCGGCA TGGGATCCTA OCATOaCAAa AAQAGCTTCXj AGACTTTCTC TCACCGCCGC 1320 

TCTTOCCTGG IGAGGCCTCT QATGAATGA.T GAAGGCCTGA AGGTCAGATA CCCCCCGAGC 1380 

CCGGCCAAGA TGACCCRGCA CTGAGGMGGG GTTaCTCCGC CTGGCCTGGC CATACTGTGT 1440 

CCCATCGGAG TGCGGACCAC CCTOVCTGGC TCTCCTGGOC CTGGAGAATC GCTCXTTGCAG 1500 

CCCCAGCCCA GCCCCACTCC TCTGCTGACC TGCTGACCTG TGCACACCCC ACTCCCACAT 1560 

GGGCCCAGGC CTCACCATTC CAAGTCTCCA CCCCTTTCTA GACCAATAAA GA6ACAAATA 1620 
CAATTTTCTA ACTCGG 

Seq ID NO: 176 Protein sequence:
Protein Accession #: NP_000682

1 11 21 31 41 51

| | | | | |

M8KISEAVKR ARAAFSSGRT RPLQFRFQQL EALQRI.IQEQ EQELVGALAA DLHKNEWNAY 60 

YEBWYVLEE lEYMIQKLPB WAADEPVEKT PQTQQDBLVI HSEPLGWLV IGTWHYPFMI. 120 

TIQPMVGAIA AGNAWLKPS ELSENMASLIi ATIIPQfYIiDK DIiYPVINGSV PBTTBLLKEB 180 

FDHILYTGST GVGKIIMTAA AKHLTPVTLE LGGKSPCSfVD KMCDLOVACR RZAHCaCFMIlS 240 

GQTCVAPDYI LCDPSIQNQI VEKLKKSLKE FYGEDAKKSR DYGRIISARH FQRVHGLIEG 300 

QKVAYGGTGD AATRYUVPTI LTDVDFQSPV MQEEIFGPVL PIVCVSSIiBB AIQFINQREK 360 

PLALYMFSSN DKVIKKMIAB TSSGGVAAND VIVHITLHSL PFG6VGNSQH GSYHGKKSFE 420 
TFSHRRSCIiV RPLMNDEWLK VRVPPSPAKM TQH 

Seq ID NO: 177 DNA sequence

Nucleic Acid Accession #: NM_001067.1

Coding sequence: 108-4703

1 11 21 31 41 51

| | | | | |

CTAACCQACG OGOGTCTGTG GAGAAGCGGC TTGGTCGGGG GTGGTCTCGT GGGGTCCTGC 60 

CTSTTTAQTC QCTTTCAGGG TTCTTCAGCC CCTTCACGAC CGTCACCATG GAAGTGTCAC 120 

CATTGCAGCC TGTAAATGAA AATATGCAAS TCAACAAAAT AAAGAAAAAT GAAGATGCTA 180 

AGAAAAQACT aTCTGTTGAA AGAATCTATC AAAAGAAAAC ACa^TTGGAA CAT ATTT TGC 240 

TCCX3CXX»QA CSVCCTACRTT GGTTCTOTGO AATTAGTGAC CCAOCAAATQ TGGGTTTACG 300 

ATGAAGATGT TGQCATTAAC TATAGGSAAia TCACTTTTOT TCCTGGTTTG TACAAAATCT 360 

TTGATCAGAT TCTAGTTAAT GCTGGGOACA ACAAACAAAG GGACCCAAAA ATGTCTTQTA 420 

TTAflAGTCAC AATTOATCOS GAAAACAATT TAATTAQIAT ATGGAATAAT GGAAAAGGTA 480 

TTCCTGTTGT TGAACACAAA QTTCAAAAGA TGTATGTCCC AGCTCTCATA TTTGGACAGC 540 

TCCTAACTTC TAGTAACTAT GAK3ATGAT6 AAAAQAAAGT GACAGGTGGT CX3AAATGGCT 600 

ATGGAGCCAA ATTGTGTAAC ATATTCAGTA CCAAATTTAC TCTGGAAACA GCCAGTAGAG 660 

AATACAAGAA AATGTTCAAA CAGACATGGA TGGATAATAT GGGAAOAQCT GGTGAGATGG 720 

AACTCAAGCC CTTCRATGGA GAAGATTATA CSITGTATCAC CTTTCAGCCT GATTTGTCTA 780 

AGTTTAAAAT GCAAAOCCTG GACAAAGATA TTGTTGCACT AATGGTCAGA AGAGCATATG 840 

ATATTCCTGG AT0CACC3UUV QKtmCMMS TCTTTCTTAA TGGAAATAAA CTGCCAGTAA 900 

AAGOATTTOS TAGTTATGTG GAC&TGTATT TGAAOQACAA GTTCGKTQM ACTGGTAACT 960 

CCTTGAAAGT AATACAT6AA CAAOTAAACC ACAGaTGGOA AC3TGTGTTTA ACTATGAGTG 1020 

AAAAAQGCTT TCAGCAAATT AGCTTTGTCA ACAGCATTGC TACATCCAAG GGTGGCAGAC 1080 

ATGTTGATTA TGTAGCTGAT CAOATTGTGA CTAAACTTGT TGATGTTGTG AAGAAGAAGA 1140 

ACAAGGGTGG TGTTGCAGTA AAAGCACATC AGGTGAAAAA TCACATGTGG ATTTTTGTAA 1200 

ATOCCTTAAT TGAAAACCCA ACCTTTGACT CTCAGACAAA AGAAAACATO ACTTTACAAC 1260 

CCAAGAGCTT TG6ATCAACA TGCCAATTGA GTGAAAAATT TATCAAAGCT GCCATTGGCT 1320 

(SIGGTATTGT AQAAAGCATA CTAAACTGGG TGAAGTTTAA GGCCCAAGTC CAGTTAAACA 1380 

Aim^ACnXSTTC AOCTOTAAAA. CKTAATAOAA TCAABCMAAT TCCCAAACTC GATGATGCCA 1440 

ATGATGCAGQ GGGCCQAAAC TCCM::TGAaT OTAOJCTTAT CCTGACTGAG GGA GATTC AG 1500 

CCAAAACTTT tSSCTGTTTCA GSCCTTGOTG TGGTTOGGAG AGACAAATAT GGGOTTTTCC 1560 

CTCTTAOAGG AAAAATACTC AATGTTOCSAa AAGCTTCTCA TAAGCAGATC ATGGAAAATG 1620 

CTORGATTAA CRATATCATC AAGATTGTGG GTCTTCAGTA CAAGAAAAAC TATGAAGATG 1680 

AAGATTCATT GAAGACGCTT CGTTATGGGA AGATAAT6AT TATGACAGAT CAGGACCAAO 1740 

ATGGTTCCCA CATCAAAGGC TTGCTQATTA ATTTTATCXaV TCACAACTCG CCCTCTCTTC 1800 

TGC6ACATCQ TTTTCTGGAG QAATTTATCA CTCCCATTGT AAAGGTATCT AAAAACAAGC 1860 

AAOAAATGOC ATTTTAC3U3C CTTCCTGAAT TTX5AAGAGTG GAAGAGTTCT ACTCCAAATC 1920 

AXAAAAAATC QAAAGTCAAA TATTACAAAG GTTTGGGCAC CAGCAC3VTCA AAGGAAGCTA 1980 

AA6AATACTT TGCAGATATO AAAAGACATC GTATCCAGTT CAAATATTCT GGTCCT6AAG 2040 

ATGATGCTGC TATCAGCCTG GCCTTTAGCA AAAAACAGAT AQATGATCGA AAGGAATGGT 2100 

TAACTAATTT CATGGAGGAT AGAAGACAAC GAAAGTTACT TGGGCTTCCT GAGGATTACT 2160 

TGTATGGACA AACTACCACA TATCTGACAT ATAATGACTT CATCAACAAG GAACTTATCT 2220 

TGTTCTOUIlA TTCTOATAAC GAGAGATCTA TCCCTTCTAT GGTGGATGGT TTGAAACCRG 2280 

GTCRGAGAAA GGTTTTGTTT ACTTGCTTCR AACGQAATOA CAAGCGAOAA GTAAAGGTTG- 2340 

CCCAATTAGC TGGATCAGTG GCTGAAATGT CTTCTTATCA TCAXGGTGAG ATGTC»CTAA 2400 

TGATGACCAT TATCAATTTG GCTCAGAATT TTGTCGQTAG CAATAATCTA AACCTCTTGC 2460 

AQCCCATTGG TCAGTTTGGT ACCAGGCTAC ATGGTGOCAA QQATTCTGCT AGTCCACGAT 2520 

ACATCTTTAC AATGCTCAGC TCTTTGGCTC QATTGTTATT TCCACCAAAA GATGATCACA 2580 

CeTTCAAGTT TTTATATGAT GACAACCAGC GTGTTGAGCX: TQAATGGTAC ATTCCTATTA 2640 

TTCCCATGGT GCTGATAAAT GGTGCTGAAG GAATCX3GTAC TQGGTGGTCC TGCAAAATCC 2700 

CCAhCTTTGA TGTGQQTOAA ATKJTRAATA ACATOVCGaB TTTGATGGAT GGAGAAGAAC 2760 

CTTTGCCAAT GCTTCCAAGT TACAAGAACT TCAftOGOrac TATTGAAOAA CTGGCTCCAA 2820 

ATCAATATGT GATTAGTGGT GAAGTAGCTA TTCTTAATTC TACAAOCATT GAA ATCTCAG 2880 

AGCrrCCCGT CAGAACATGQ ACCCAGACAT ACAAAOAACA AGTrCTAOAA CCCATGTTGA 2940 

ATGGCACCGA GAAGACACCT CCTCTCATAA CAGACTATAG GGAATACCAT ACAGATACCA 3000 

CTGTGAAATT K3TTGTGAAG ATGACTGAAG AAAAACTGGC AGAGGCAiGAG AGAGrTGOAC 3060 

TACACAAAGT CTTCAAACTC CAAACTAGTC TCACATGCAA CTCTATGGTG CTTTTTGACC 3120 

ACGTAGGCTG TTTAAAGAAA TATGACACGG TGTTGGATAT TCTAAGAGAC TTTTTTGAAC 3180 

TCASACTTAA ATA.TTATGGA TTAAOAAAAG AATGGCTCCT AGGAATGCTT GGTGCTGAAT 3240 

CIGCTAAACT GAATAATCAG QCTOQCTTTA TCTTAGAGAA AATAGATGGC AAAATAATCA 3300 






GAATTAATTA A 




TTGAAAATAA GCCTAAGAAA 
ATCCTGTGAA GGCCTGGAAA 
AQAGTGACAA 
TCAACTATCT 

GCAOCXTTAAO AAAXGAAAAA 
ATTTGTGGAA AGAAGACTT6 
AAAAACAAGA TGAACAAGTC 
CACAAATGGC TGAAGTTTTG 
TAGAAATOAA AGCAGAGGCA 
AAOSAAGCCC TCAASAAGAT 
AACAOAAAAG AGAACCAGGT 
TCAAAAAAGG AAAGAAOAOA 
AAAGTAATTT TOATGTCCCT 
AATTCACAAT GGATTTGGAT TCAGATGAAG 
ATGAAGATTT TGTCCCATCA 
GTAACAAAGA ACTGAAACCA 
AQGGCAGTGT ACCACTGTCT 
ITACAAACCC AOTTCCTAAA 
CTTCCACCTC 



ATTTAACCAA 
TGGACACATT 
TTGAAGAATT 



GGAAAAGAAA 



GGTGTGGAAC 
ACAAAOACAA 
AATCCCTGGC 



GTGGTCAAAG 
ATAAAAAGAA 
TAGAAGGCCr 



GGAGGCTGTT 
GAAGGCCAAG 
AGTCATTCCA 
AATTAAGAAT 



GAAAATQAAG 
GSACCAACCT 
GATGAACICT 
AGTCCATCAG 
6AAGCCAAGG 
GOGAAAAAAA 
CX3AATAACCR 
GAAAATACTO 



CAGAAAAGTG 
TCAAGCCXnx: 
AAGAATGTGA 



CA6AGCCACG 
ATTTCTCAGA 
CACCTAAGAC 
TCGTGTCAGA 
CTGCTACACA 
CAGTGAAGAA 
GGGCTGCCCC 



TACATT6GCA TTTAAGCCAA 
ATCASATAGG A6CAGTGAC6 
GAGAGCAGCA ACAAAAACAA 
TTTTGATGAA AAAACTGATG 
CAAAACTTCC CCAAAACTTA 
GATGATGTTA 
GAAACTGAAA 



TTTCCCAGAT 
GAGAGCAGCA 
AAAAGGAACT 



CAOCTTTGAA TTCTGOTGTC TCTCAAAAOC CTOATCCTGC CAAAACX»AG 



AAAOOAAGCX; 
CAQTCACAAG 
CTGTGGCTCC 
CAGATGAAGA 
GCCCAAGACT 
TGGGGAAGGT 
TTTTTATAAT 



T6TAGAAATA 
AGCTAAAACT 
AOATATOAGA 
GATAGAACTT 



ATCCACTTCT 
CAAGAAATCC 
TCGGGCAAAA 
TGATCTGTTT 
GGTTTTAAAG 
GTTTTTAGTA 
ACTGTCTAAA 
GAGTCTGCTT 
GCTATCTGAT 



TCTGTACGGG 
TAAAATGTGA 
TTACCTGAAG 
CAAGACATCA 
TAGTGACCAT 



AATG6COGCA 
GTTTCXaAAAa 
TTTGACTCAG 



AAAGAAAAAC 
QAGTAGTTAT 
ATCTCCCAAA 
CATTTOATCC 
TTCRTTTTGa 
OTTATATGTG 
TCTATTAGCT 



I 

MBVSPLQPVN 
MWVYDEDVGI 
NGKGIPWEH 
TASRBYKKMF 
RRAYDIAGST 



TAOAGCATAA 
TATGGTTCTA 
AGCAATGAGA 
ACTTTGGCTG 
GTGATTATTT 

AAGATCTTAA 
QAAATCTCCA 
TGT6ACTTGA 
AAATTCCAAC 



GATTTAAAAG 
TTATCTGTTT 
GTACAGATAC 
AATTGCTCAT 
TGTCTATAAC 
CAGCTCTTGA 
AATTTCTAAQ 
ATGTTATATT 



GGCGATTATT 
CTCTTAACTT 
AAGTGAAGTA 
CTCATGGGCA 
TTAAAACCTG 
TAAAGCAQTG 
TGTCACTCTT 



CCTCXrCCTCT 
AAGCCCAAOT 
TTGTTTTCTT 
ATTTTTAAGT 
TGTTTATTAA 



TATCTTAOTT- 
TCTACTACAC 
GTTCTTCATC 
TTGACACAGT 
CCTGTCCCCT 
AGQACTGOAT 
GATAAOCATQ 
TTGTAAACTT 
CAACGTTTTT 
TTTAATAAAA 



TTATACATAA 



TTCTCAAATC 
CAATAQAATG 
CTGGCTGCCT 
TGCAGAAGAC 
CTCAOCAATQ 
TGTTAAGACC 
GTAAATATTT 



ATCTTACCAA 
GAATTTAGTT 
GTTCTTTAGC 
CTCTGCTTTG 
TCTTCTGAAC 
CC ATCC ACTA 
TACTTTCA6T 
TTTACCATCA 
ATGT6CCAAG 
ATCAGAGGCC 
AAGAAAATTA 
CTGAGTCTGA 
TCGGGGACAA 
AGCTATTAiQA 
TGTCTACATT 
ACTATOTTTT 
ATTGC 



3360 
3420 
3480 
3540 
3600 

3seo 

3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 



4920 
4980 
5040 
5100 
5160 
5220 
S2S0 
5340 
5400 
5460 
5520 
5580 
5640 
WIPVNAI.

VQUIKKCSAV 

YOVFPLRGKI 

OQDQOGSHIK 

STPNHKKWKV 



GLKPGQRKVI. 

YIPIIPKVIiI 
BLAPNQYVIS 
HTDTTVKPW 

GVDSDPVKAW 



NYREVTFVPG 
KVEKMYVPAIi 
KQTHMDNMOi 
KDVKVFUIOJ 
ISFVNSIATS 
PTFDSQ1 
KHNKIKQIPK 
LNVREASHKQ 
GI-I.XNF1HHII 
KYVKGLGTST 
DRRQRKLI.GL 
PTCFKRNDKR 



NE33AKKRLSV 
LYKIFDEILV 
IFGQLLTSSN 



51 



GSVAIUrSTT 



KLPVKGFRSY 
KGGRHVOYVA 
MTLQPK8PGS 
LDDANDAG6R 
IMEHAEUWI 
WPSIiLRdRFIi 
SKEAKEYFAO 
PEDYI.YGQTT 
EVKVAQLAGS 
ASPRYIFTTOi 
SOCIRfFDVR 
lEISELPVRT 



I I 
ERIYQKKTQL BHILLRPDTY IGSVELVTQQ 
NAADNKQRDP KMSCIRVTID PENNLISIWN 
YDDDEKKVTG GRNGYGAKLC NIFSTKFTVE 
GEDYTCITPQ PDLSKFKMQS LDKDIVALMV 
VDMYLKDKLD BTCaiSLKVIH EQVHHRWEVC 



TOQLSBKFIK 
HSTBCniIt.T 
IKIVGr<QYKK> 



NYEOBOSLiCr LRYGKIMIMT 
SKNKQEMAFY SLPEFEEWKS 
SGPEDDAAIS LAFSKKQIDO 
XELILFSNSD HERSIPSMVD 
EMSLMMTIIN LAQHPVQSNN 
SStiAItIJ.FPP KCDBTUCFbY DDNQRVEPEW 



KGKKTQMAEV 



QUUCEWliLGM 
KEAQQKVPDE 
KEQEIiDTIiKR 
I.PSPRGQRVI 



LGAESAKLNK 



KSPSDLWKED 
PRITIBKKAB 
AFKPIKKGKK 



NTQTYXEQVL 
LQTSLTCaiSM 
QARFII.BKIO 
ETBKSDSVTD 
LATFIEELEA 
AEEOOIKKKIK 



KYDTVLDILR 
GKIIIENKPK KBLIKVLIQR 
SGPTFNYUJJ MPLWYLTKKK 
VEAKEKQDBQ VGIiPGKGGKA 
NENTBGSPQE DGVELBGLKQ 



1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 



Seq ID NO: 179 DNA sequence
Nucleic Acid Accession #: Eos
Coding sequence: 148-7095

1 11 21 31 41 51
| | | | | |
CACACATACG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA
CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA
CGGCGAGGGG CCGCAGACCG TCTGGAAATG CGAATCCTAA AGCGTTTCCT CGCTTGCATT
CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA
CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG
AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA
CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA
AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 480

GTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540

AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600

GAGATGCAAA TCTACTGCTT TGATGCGGAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720

GATTTCAAAG CGATTATTGA TGGAGTCGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080

AGTTCAGAAC CAGAAAATQT TCAGGCTaAC CCAGAOAATT ATACCAGCCT TCTTGTTACA 1140 

TGGQAAAGAC CTCGAGTCGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCRG 1200 

CAQTTGOATG GAGAGGACCA AACC3iAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 12S0 

GGTQCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAQAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380 

AATCCTQAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGSAG 1440 

GAAGAGGGAA AAGACATTGA AGAAGGOGCT ATTGTGAATC CTGOTAGAQA CAGTGCTACA ISOO 

AACCAAATCA GOAAAAAGOA ACCCCAGATT TCTACKACAA CACACTACAA TCaCATAGGG 1560 

ACGAAATACA ATGAAGCCRA GACTAACOOA TCCCCAACAA OAOaAAGTGA ATTCTCTGOA 1620 

AAGGOTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC RACCACTCRC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTO AACTGCX3«X: TCACACTGTG 1740 

GAASGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCyWSTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

QAAAACCCAG AQACAATAAC ATATOATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

6AAGATTCAA CTTCATCAGO TTCAGAAOAA TCaCTAAAGG ATCCTTCTAT GGAGGOAAAT 2100 

QTGTGGTTTC CTAGCTCTAC AGACATAACA GCAOWSCCCG ATGTTGGATC AGGCAGAQAG 3160 

AGCTTTCTCC AGACTAATTA CACTGAOATA C0TGTT6ATO AATCTQAOAA QACAACCAAG 3220 

TCCTTTTCTG CSlGGCCCftGT GATGTCACAG GGTCCCTCAG TTACAQATCT GGAAATGCC3V 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAQACAAC AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC AACGCAACCG 2400 

GTATACAATG GTGAGACACC TCTTCAACCT TCCTACAGTA GTGAAGTCTT TCCTCTAGTC 2460 

ACCCCTTTGT TGCTTGACAA TCAGATCCTC AACACTACCC CTGCTGCTTC AAGTAGTGAT 2520 

TCGGCCTTGC ATGCTACGCC TQTATTTCCC AGTGTCOATG TGTCATTTGA ATCXATCCTG 2580 

TCTTCCTATG ATOGTGCACC TTTGCTTCCA TTTTCCTCTG CTTCCTTCAG TAGTQAATTG 2640 

TTTCGCCATC TGCATACAGT TTCTCARATC CTTCCACAAG TTACITCRGC TACaGAGAQT 2700 

GATAAGGTGC CCTTGCATGC TTCTCTGOt» GTGGCrOGOa QTGATTTGCT ATTAGAGCCC 2760 

AGCCTTGCTC AGTATTCTGA TGTGCTGTCC ACTACTCATG CTGCTTCAGA GACGCTGGAA 2820 

TTTGOTAGTa AATCTGGTGT TCTTTATAAA ACXSCTTATGT TTTCTCAAGT TGAACCACCC 2880 

AGCASTGATG CCATGATGCA TGCACGTTCT TCAGGGCCTG AACCTTCTTA TGCCXTGTCT 2940 

QATAATGAGG GCTCCCAACA CATCTTCACT GTTTCTTACA GTTCTGCAAT ACCTGTGCAT 3000 

GATTCT6TGG GTGTAACTTA TCAGOGTTCC TTATTTAGCX3 GCCCTAGCCA TATACCaVATA 3060 

CCTAAGTCTT CGTTAATAAC CCCAACTGCA TCATTACTGC AGCCTACTCA TGCCCTCTCT 3120 

GGTGATGGGQ AATGGTCTGG AGCCTCTTCT GATAGTGAAT TTCTTTTACC TGACACAGAT 3180 

GGGCTGACAG CCCTTAACAT TTCTTCACXTT GTTTCTGTAG CTGAATTTAC ATATACAACA 3240 

TCTGTGTTTG GTGATGATAA TAAGGCGCTT TCTAAAAGTO AAATAATATA TGQAAATGAQ 3300 

ACTGAACTGC AAATTCCTTC TTTCAATGAG ATGGTTTACC CTTCTCAAAG CACAOTCATG 3360 

CCCAACATGT ATGATAATGT AAATAAGTTG AATGaJTCTT TAC3U«1AAAC CTCTG TTTCC 3420 

ATTTCTAQCA CC3U«3GGCAT GTrXCCAGGG TCCCTTOCTC ATACCACCAC TAAGOTTTTT 3480 

QATCATGAOA TTAGTCAAGT TCCAGAAAAT AACTTTTCAG TTCAACCTAC ACATACT6TC 3S40 

TCTCAAGCAT CTGGTGACAC TTCGCTTAAA CCTGTBCTTA GTGCAAACTC AGAGCCAGCA 3600 

TCCTCTGACC CTGCTTCTAG TGAAATGTTA TCTCCTTCAA CTCAGCTCTT ATTTTATGAG 3660 

ACCTCAGCTT CTTTTAGTAC TGAAGTATTQ CTACAACCTT CCTTTCAGGC TTCTGATGTT 3720 

QACACCTTCC TTAAAACTGT TCTTCCAGCT GTGCCCAGTO ATCCAATATT GGTTGAAACC 3780 

CCCAAAGTTG ATAAAATTAG TTCTACAATG TTGC3VTCTCA TTGTATCAAA TTCTQCTTCA 3840 

AGTGAAAACA TGCTGCACTC TACRTCTGTA CCAQTTTTTQ ATGTGTOGCC TACTTCTCAT 3900 

ATGCACXCTG CTTCACTTCA AGGTTTGACC ATTTCCTATG CAA6TGAGAA ATATGAACCA 3960 

aTTTTOTTAA AAAGTGAAAG TTCCCACCAA GTGGTACCTT CTTTGTACAG TAATGATGA6 4020 

TTGTTCCAAA CGGCCAATTT GGAGATTAAC CAGGOCCATC CCCCAAAACG AAGGCATQTA 4080 

TTTGCTACAC CTGTTTTATC AATTOATOAA CCATTAAAZA CACTAATAAA TAAGC TTATA 4140 

CATTCCQATO AAATTTTAAC CTCCACCAAA AGTTCTQTTA CTGGTAAGGT ATTTGCTGGT 4200 

ATTCCAACAG TTGCTTCTGA TACATTTGTA TCTACTGATC ATTCTGTTCC TATAGGAAAT 4260 

GGGCATGTTG CCSVTTACaGC TGTTTCTCCC CACAGAGATG GTTCTQTAAC CTCAACAAAG 4320 

TTGCTGTTTC CTTCTAAGGC AACTTCTGAG CTGAGTCATA GTGCCAAATC TGATGCCGGT 43 80 

TTAGTGGGTG GTGGTGAAGA TGGTGACACX GATGATGATG GTGATOATGA TGATGATGAC 4440 

AGAGQTAGTQ ATGGCTTATC CATTCATAAG T6TATGTCAT GCTCATCCTA TAQAGAATCA 4500 

CAGGAAAAGO TAATGAATGA TTCAOACACC CACQAAAACA QTCTTATGQA TCAGAATAAT 4560 

CCAATCTCAX ACTCACTATC TGASAATTCT GAAQAAOATA ATAOMIICAC AAGTOTATGC 4620 

TCAGACBflTC AAACTOGTAT GOACAOAAGT OCTQOTAAAT CACCATCAGC AA AJGG GCTA 4680 

TCCCAAAAGC ACAATGATG6 AAAAGAGOAA AATGACATTC AOACTGOTAG TGCTCTGCTT 4740 

CCTCrgAGCC CTGAATCTAA AGCATGGGCA GTTCTGACAA GTGATGAAGA AAGTGGATCA 4800 

GGGCAAGGTA CCTCAGATAG CCTTAATGAG AATGAGACTT CCACAGATTT CAGTTTTGCA 4860 

GACACTRATG AAAAAGATGC TGATGGGATC CTGGCAGCAG GTGACtC3MSA AATAACTCCT 4920 

GGKrCCCCAC AGTCCaaAC ATCATCTGTT ACTAGCGAGA ACTCAGAAGT GTTCCACGTT 4980 

TCAOAGGCAQ AGGCCAQTAA TAGTAGCCAT GAGTCTCGTA TTGGTCTAGC TGAGGGGTTG 5040 

OAATCOGAGA AGAAGGCAGT TRTACCCCTT GTGATCXJTGT CAGCXXTGAC TTTTATCTGT SlOO 

CTAGTOGTTC TTGTGGGTAT TCTCaTCTAC TGGAGQAAAT GCTTCCBOAC TGC ACAC TTT S160 

TACTTAOKX3 ACAGTACATC CXXrrAOAen ATATCCACAC CTCCAACACC TATCTTTCCA 5220 

ATTTCAGATG ATGTCGGAGC AATTCCRATA AAOCACTTTC CAAAGCATGT TGCAGATTTA 5280 

CATGCAA6TA GTGGGTTTAC TQAAGAATTT GAGACACTGA AAGAGTTTTA CCAGGAAGTG 5340 

CAQAGCTGTA CTGTTGACTT AGGTATTACA GCAGAC3M3CT CCAACCACCC AGACAACAAG 5400 

CACAAGAATC GATACATAAA TATCX3TTGCC TATGATCATA GCAGGGTTAA GCTAGCACAG 5460 

CTTQCTQAAA AGQATGQCAA ACTGACTGAT TATATCAATG CCAATTATGT TGATCGCTAC 5520 

AACA6ACCAA AAGCT7ATAT TGCTGCCCAA GGCX:CACTGA AATCCACAQC TGAAGATTTC 5580 

TQGAGAATGA TATGGGAACA TAATGTGGAA GTTAtTOTCA TGATAACAAA CCTOGl'GQAa 5640 
AAAGGAAGGA GAAAATGTGA
TTTCTOGTCA CICAGAAGAG 
CTAAGAAACA CAAAAATAAR 
ACACACTATC ACTACACX>CA 
CTQACCTTTG TGAGAAAGGC 
CACTGCAGTG CTGGAGTTGO 
CAGATTCAAC ACGAAGGAAC 
AGAAATTATT TGQTACAAAC 
GCCAXACTTA GTAAAGAAAC 
CTCCTCATTC CTGGACCAGC 
CASTCAAATA TACAQCAOAQ 
AATC6AACTT CTTCTATCAT 
GQAGAAGGCA CAGACTACAT 
TTCATCATTA CCCAGCACCC 
GACXaVTAATG CCCAACTGGT 
TTTGTTTACT GGCCAAATAA 
ATGGCTGAAG AACACAAATG 
TTAQAAGCTA CACAGGATGA 
CX»AATCCAG ATAGCCOCAT 
GCIOCCAATA GGQATQG6CC 
ACTTTCTGTG CTCTGACAAC 
TACXSkGOTAG CCAAQATGAT 
TATCAGTTTC TCTACAAAGT 
TCCACCTCTC TGGACAGTAA 
GAGTCTTTAG TTTAACACAG 
TTCCTAAAAT TAGGCSkGGAA 
GACAGTAACT TTCATGACAT 
CmTTGCAA GACTTGTAAT 
TATTTCTAAG AATGGAATTa 
TTTATAGAGO TTAGGAATTC 
GCIOTATTTG TAGCAATIAT 
AAATAAAACA CTCTTCCATA 
AGAAATAATC TGTTACTTAT 
TTTATAATTG TAQATTTTTA 
GTTTAGTTTA ATGACQTAGT 
TTGTGTTACC TAAGTCATTA 
GfiAATACCTT CATTTTGAAA 
AATGGTTTTT ATCCAAGGAA 
AAAAAAAAAA AAAAAAAAAA 



1 

MRILKRFLAC 
QSPIMIDEUJL 
FKASKITFHW 
ILFEV6TEEN 
TDTVDWIVFK 



AAAOGGCTCC 
GtGGCCTGAC 
AGCCTATGCC 



TGTCAACATA 
TGAGGAGCAA 
TGAGGTGCT6 



TGACTATTCT 



CAATGCCTCC 
TCTCCTTCAT 
GGTTATGATT 



TCTATCTAAT 



TCACTACTGG CCTGCXSATG G 
CTTGCCTATT 
CAQAAAGQAA 
ATGGGAGTAC 
AAGCGCCATG 
ACATATATTG 
TTTGGCTTCT 
TATGTCTTCA 
GACAGTCATA 
AAGCTAGAGA 
GCAOCCCTAA 
AGATCAAGGG 
TATATCATGQ 
ACCATCAAGG 
CCTGATGGCC 
ATAAATTGTG 
GAGGAAAAAC 
GAAGTQAGGC 
TTTGAACTTA 
CATGATGAGC 
CAACTAGAAA 
AGGCCAGGAG 
CTTGTGAGCA 
TTGCCTGATG 
GGGGACTOVC 
GTTCTGTTAT 
CGCaVAATTT 
TGTTTGAACT 
TXCTGTATTG 
AAAATGTTTG 
AGAAATATAA 
CATTTTACAA 
GCCCXAGTOT 
CTGAGTCAAQ 
GTCTTACTCT 
AOCATGTAAT 
TGAGAATAAC 
AAATATAAAT 



CAGAGTACTC 
CAGTGGGGCC 
TGCTAGACAG 
TAAAACACAT 
TTCATGATAC 
TTCATGCCTA 
AACAATTCXA 
AGCAATGCAA 
TTGGCATTTC 
GCTATTACCA 
ATTTCTGGAG 



CCTGCCAGTO 
TGTTGTCGTC 
TATGTTGCAG 
CX53TTCACAA 
ACTGGTTGAG 
TGTTAATGCA 



CAGGGAAAAG 



TAGTAAAACT 
TATQATTGTT 
CCTTATGCAC 
CAATCTGATG 
GATCCTCAGC 



TTACTTATTA 
TGGTATTTTT 
CAAACTACAG 



TGATATTCAA 
TOTAAATACT 
TATTTTACTA 



AQAGCTTTAA 
TTATAATTCA 
ACTTTCAGTG 
TAAGTGTTAT 
ATGGAGQAQT 
AAGAAAATTC 
TCTTTGCTGA 
CAAGGCAGGA 
GAAATATAGC 
ATCTGAGCAT 
CTGTTGATTT 
ATATCATTAA 
AAAATGATTG 
ATTTTA ACAG 
TTTTTAGTGT 
CnrTAATAC 



GAGCAATGAA 
GATGATA1GO 
ASAAGAT6AA 
GGTCACTCrr 
GGACTTTATC 
TCCTAAATGG 
AAAAGAA6AA 
GAOMCAGGA 
CGTGGATGTT 
CATTGAGCAG 
AGAGAATCCA 
TGAGAGCTTA 
TGTTTTCCTC 
CCCATCACCT 
CAATGTGTGC 
AATTTTACAG 
AAAATTTCAA 
CAAATTTTTA 
AGTAGCCTGT 
CACCTAAAGT 
CAAATTTATA 
CTGTGTAATT 



ACTTTGTTTC 



CTCCATGGAC 
TTTTCTAGTT 
ACC3U3TTTTC 

TTTAACTTTT QTGGAAAATA 



5700 
5760 
5&20 
5880 
5940 

eooo 

6060 
6120 
GIBO 
6240 
6300 
6360 
6420 
64S0 
6540 
6600 
6660 
6720 
6780 
6B4D 
6900 
6960 
7020 
7080 
7140 
7200 
7260 
7320 
7380 
7440 
7500 
7S60 
7620 
7680 
7740 
TTGCAAAAAT



TQVNVNLKKL 
GKOmSSDGS 
IiDFKAIIDGV 
DTVSISBSQL 



BSVSRFGKQA 
AVFCEVLTMQ 
DPSHYTSLLV 
NMSYVIiQIVA 



GSKTVLRSPH 
ENISQGYIPS 
TAQPDVGSGR 
TBVTPHAFTP 

urrcPAASSs 

II.F(2VTaATB 



U NSTSQPVTKL 



ESPLQTNYTE 
SSRQQDLVST 
DSALHATPVF 



VLIPBSARNA 
IRVDESEKTT 
VNWYSQTTQ 
PSVpVSFESI 
PVA6GDLLLE 



KLVEEIGWSY 
EajTPIHNTGK 
IfMQIYCFDA 
ALPPFILUIII. 
QSGYVHLMDY 
TWERPRWYD 
IdHGLYOKr 
TNQIRKXEPQ 
AIEKDISLTS 
ESIiLTSFKIiD 



!k. AAAAAAAAAA 
I



TGALNQKNWG KKyPTOISPK 
■rVBINLTHDY RVSGGVSEKV 
DRFSSFBEAV KGKaia<RAI.S 
UNSTDKYYl XNGSLTSPPC 
liQNHFREQQy KFSRQVFSSY 

•skxzkpkhvc qquigeiiqtk 
sdqlivdmpt onpeij}i.ffe 



KSFSAGPVMS 
PVYNGETPIiQ 
IiSSYDGAFUi 



QIVTEIjPPBT VBGTSASUID 540 

TQABDSSGSS PATSAIPFIS 600 

ESLKDPSMEG NVWFPSSTDI 660 

QGPSVTDLEM PHYSTFAYFP 720 

PSYSSEVFPL yTPLLLDNQI 780 

PFSSASFSSB IiFREUmrSQ 840 

EF6SBSGVLY 900 



? TVSYSSAIFV f 



AVPSDPILVE 
TISYASEKYE 
EPIiHTLMKL 
FEEiOGSVTST 



LSPSTQIAFY B 
MLKLIVSNSA £ 
QWPSIiYSND E 
KSSVTGIWFA C 



t EMVYPSESTV 
S NNFSVQPTHT 
r UiQPSFQASD 
3 VPVPDVSPTS 
[ NQAHPPKGRH 



3 BLSHSAXSOA GLVGGGEDGD TODOGODODO ORGSDOIiSXR 
SEM SEEDNRVTSV SSDSQTOIDR 
KAW AVLTEDEBSG 5GQGTSDSUI 



VISTPPTPIF 
TADSSMHPON 
QGFLKSTAED 
VLAYYTVRMF 



LESEKKAVIP 
PISDDVQAIP 
KHKimyiNIV 
FWRMIWEHNV 
TLRNTKIKKQ 



SNDIQTGSAL 
ILAAGDSEIT 
IiVIVSALTFI 
IKHPPKHVAD 
AYDHSRVKIA 



AKRKAVQPW VHCSAGVGRT GTYIVUJSMIj 
QyVFIHDTIiV EAIMKETEV LOSHIHAYVN 



KHRTSSIIPV 
WDKNAQLWM 
I1.EATQDDYV 
GTFCALTTLM 
PSTSU3SKGA 



ERSRVGISSL 
IPDQQNMAED 
I.EVBHPQCPK 



CLWLVGIIiI 
LHASSGFTEE 
QLAEKDGKLT 
EKGRRKCDQY 
VTQYHYTQMP 
QQIOREGTVH 
AIJ.IPGFAaK 
SGBSroyiMA 



YVniKCFQTAH 
FEIiatEFYQE 
DYINANYVDG 
WPADGSEEYG 
tlMGVPEYSI.F 
IFGFLKHIRS 
TKLEKQFQLI. 
SYIMGYYQSN 
PINCESFKVT 
TFBLISVIXE 
MRPGVFADIE 



FYLEDSTSPR 
VQSCTVDU3I 
YNRPKAYIAA 
NFLVTQKSVQ 
VLTPVRXAAY 
QRUYW/QTEE 



1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 



IBOO 
1860 
1920 



BFIITQHPI.D 



ALPOGNIAES I 
CACACATACG



CAGCTCCTCT 
CTTGTTGAAG 
AAATATCCAA 
CAAGTAAATG 
AACACA.TTCA 
QTCAGCGGAG 
AAATGCAATA 
GAGATGCAAA 
GGAAAAGGGA 
QATTTCAAAO 
TTAGATCCAT 
AATGGCTCAT 
ACAGTTA60V 
TCTGQTTATG 
TTCTCTAQAC 
AGTTCAGAAC 
TGGCjAAAQAC 
CAGTTGGATG 
GGTGCTATTC 
TGCACTAATG 
AATCCTGAAC 
GAAGAGGGAA 



CACGCACGAT 
ATTTCCTTCG 
CCGCAGACCG 



AGATTGGCTG 
CATGTAATAG 
TGAATCTTAA 
TTCATAACAC 
GASmCAGA 
TGTCATCTGA 
TCTACTGCTT 
AGTTAAGAGC 
CGATTATTGA 



CTCCCCCTCC 
TCTGGAAATG 
CCTGGATTGG 
GTCCTATACA 
CCCAAAACAA 



31 
I 

TCTATACACT 
CTCTCCACTC 
CGAATCCTAA 
GCTAATGGAT 
GGAGCACTGA 
TCTCCTATCA 



AATGGTGTTT 



ACGAAATACA 
AAGGGTGATG 
ACAGAAAAAQ 
GAAGGTACTT 

AAcrrexcGG 

AGTTTATTQA 
GCAACTTCTG 
QAAAACCCAG 
GAAGATTCAA 
GTGTGGTTXC 
AaCTTTCTCX: 
TCCn 



TGACATCTCC 
TCTCTGAAAG 
TCATQCTGAT 
AGGT6TTTTC 
CAGAAAATGT 
CTCGAGTCGT 
QAQAGGACCA 
TCAATAATTT 
GCTTATATGG 
TTGATCTTTT 
AAGACATTGA 
GGAAAAAGGA 
ATGAAGCCAA 
TTCCCAATRC 
ATATTTCCTT 
CAGCCTCTTT 
GGACTGCAGA 
CCAGTTTCAA 
CTATCCCATT 
AGACAATAAC 
CTTCATCAGG 
CTAGCTCTAC 
AGACTAATTA 



GAACCTTCTG 
TCCXTTGCACA 
CCAGTTGGCT 
GGACTACTTA 
CTCATACACT 
TCAGGCTGAC 
TTATGATACC 
AACCAAGCAT 
GCTACCCAAT 
AAAATACAGC 
CCCTGAATTA 
AGAAGGCGCT 
ACCCCAGATT 



QTOGAAATTA 
AAAGCAAGCA 
CATAGTTTAG 
OGATTTTCAA 
TTGTTTGAGG 
AGTGTTAGTC 
CCAAACTCAA 
GACACflGTTG 



ATCTTTAAAT 
GACTTCTCAG 
AAATGATGGC 
ATCCTTAAAT 
QCTTGATACT 
CATCTCTGAG 
ATATGATGTC 
TTCAGAAGAA 



CAAAACAATT 
GGAAAGGAAG 
CCAGAGAATT 
ATGATTGAGA 
GAATrmOA 
ATGAGTTATG 
GACCAACTGA 
ATTGGAACTG 
ATTGTGAATC 
TCTACCACAA 
TCCCCSUVCAA 
TCCACTTCCC 
ACTGTGACTG 
TCTAAAACTQ 
ACAGTTTCTA 
GGAGCTGAAG 



TGAGAAGCAG 
AGQ3TTTCCT 
ACTACAGACA 
ATCAAAAAAA 
ATATTGATGA 
GGGA-CAAAAC 
ATCTCftCTAA 
AGATAACTTT 
AAGGACAAAA 
GTTTTGAGGA 
TTGGG ACAGA 
GTTTTGGGAA 
CTGACAAGTA 
ACTGGATTQT 
AAGTTCTTAC 
TTCGAOAGCA 
AGATTCATQA 
ATACCAGCCT 
AGTTTGCAOT 



AACAAACAAA 
AGGAGCCGCA 
CGCTTGCATT 



TTGGGGAAAG 
AGATCTTACA 
ATCATTGGAA 
TGACTACCGT 
TCACTGGGGA 
ATTTCCACTT 
AGCAGTCAAA 
AGAAAATTTG 
GCAGGCTGCT 
TTACATTTAC 
TTTTAAAGAT 
AATGCAACAA 



TTCTTCAQAT 
TTGTCGACAT 
AAGAAATAAT 
CTGGTAGAGA 
CACACTACAA 
GAGGAAOTGA 
AACCAGTCAC 



AGCAGTTTGT 
TCTTGTTACA 
TTTGTACCAG 
TCAAGACTTG 
AGTAGCCATA 
GCCTACTGAT 
CAAGGAGGAG 
CAGTGCTACA 
TCGCATAGGG 
ATTCTCTGGA 
TAAATTAQCC 
TCACACTGTG 



CTTATACCAG 
TCACTAAAGG 
GCACAGCCOG 
OGTGTTGATG 



CATTATTCTA CCTTWWCTA CTTOCCAACT OA OGTAAC AC 
AAC6TGGTAT 
CATGAGTCTC 
CTTGTGATQO 



TAACAGAATA 
ATTCTTGAGQ CTCCAGTCCC 
AAGGOTATAT ATTTTCCTCC 
AATCTGCTAG AAATGCTTCC 
ATCCrrCTAT GGAGGGAAAT 
ATGTTGGATC AGGCAGAGAG 
ARTCTGAGAA GACAACCAAG 
TTACAGATCT GGAAATGCCA 
CTCATGCn 



r TACCcxaTcc 



TTTTACTTAG 
CCAATTTCAG 
TTACATGCAA 



TTCTTGTGGG 
AGGACAGTAC 
ATGATGTOGG 
GTAGTGGGTT 
GTACTGTTGA 
ATOQATACAT 



TAATAOTAGC 
AGTTATACXX: 
TATTCTCATC 
ATCCCCTAGA 
AGCAATTCCA 
TACTGAAGAA 
CTTAGGTATT 
AAATATOGTT 



GTATTGGTCT 
TOTCAGCCCT 
AATGCTTCCA 



GACTTTTAI 



TACAACAOAC CAAAAGCTTA TKl 



GAGAAAGGAA 
AACTTTCTGG 
ACTCTAAGAA 
GTCACACAGT 
GTGCTGACCT 
6TCCACTGCA 



GGAGAAAATG 
TCACTCAGAA 
ACACAAAAAT 
ATCACTACAC 
TTOI6AGAAA 



CAAAOAAATT 
GAGGCCATAC 

GcacTccrcA 

AGCCAGTCAA 
AAGAATCGAA 



ATTTQGTACA 
TTAGTAAAGA 
TTCCTGGACC 
ATATACAGCA 
CTTCTTCTAT 



ACATAATGTG 
TGATCAGTAC 
GAGTGTGCAA 
AAAAAAGGGC 
GCAGTGGCCT 
GGCAGCCTAT 
TGGAAGAACA 
AACTGTCAAC 
AACTGAGGAG 



ACAGCAGACA 
GCCTATGATC 
GATTATATCA 
CAAGGCCCAC 
GAAGTTATTG 
TGGCCTGCCG 
GTGCTTGCCT 
TCCCAGAAAG 
GACATGOGAG 
GCCAAGCS3CC 
GGCACATATA 
ATATTTGGCT 



TTCCAAAGCA 
TOAAAQAGTT 
GCTCXAACCA 
ATAGCAGGGI 
ATQCCAA1TA 
TGAAAICCAC 
TCATGATAAC 
ATGGGAGTGA 
ATTATACTGT 
GAAGACCCAG 



ACCTATCTTT 
TGTTGCAGAT 
TTACC»GGAA 
CCCAGACAAC 
TAAGCTAGCA 
TGTTGATGGC 
AGCTGAAGAT 



GAATTCATCA 



6AATTT8TTT 
CrXATQGCTG 
ATCTTAGAAG 
TGGCCAAATC 
GAAGCTGCCA 
GGAACTTTCT 
GTTTACCAGQ 
CAGTATCAGT 
CCATCCACCT 

rrAaAGTCTT 

CTCTTCCTAA 
CCTGACAGTA 
TGCCTTTTTG 
C3M3TATTTCT 
CAATTTATAO 
TXAGCTGTAT 
TGXAAATAAA 



TTACCCAGCA 
ATGCOCAACT 
ACTGGCCAAA 



GTQCTCTGAC 
TAGCCAAQAT 
TTCTCTACAA 
CTCTGGACA3G 
TAGTTTAACA 
AATTAGGCAG 
ACTTTCATGA 
CAAGACTTGT 
AAGAATGGAA 
AGQTTAOQAA 
TTGXAGCAAT 
ACACTCTTCC 



AGCAGGGAAA 
GAGTGACTAT 
CATCCCTGTG 
CATCAATGCC 
CCCTCTCCTT 
GGTGGTTATG 
TAAAGATGAG 
ATGTCTATCT 
TGATTATGTA 
CATTAQTAAA 
GCCTATOATT 
AACCCTTATG 
GATC31ATCTQ 
AGTQATCCrC 
•TAATGGTGCA 



CTGGACAGTC 
ACAAAGCTAQ 
TCTGCAGCOC 
GAAAGATCAA 
TCCTATATCA 
CATACCATCA 
ATTCCTGATQ 
CCTATAAATT 
AATGAGQAAA 
CTTGAAGTGA 
ACTTTTQAAC 
GTTCATGATG 



ATGCAGTGGG 
TTGTGCTAGA 
TCTTAAAACA 
TCATTCATGA 
ATATTCATGC 
AGAAACAATT 
TAAAGCAATG 
GGQTTGGCAT 
TGGGCTATTA 
AGGATTTCTG 
GCCAAAACAT 
GTGAGAGCTT 



GGAGTACGGG 
GAGGAATTTT 
TGGACGTGTG 
CTCCCTGCCA 
GCCTGTTGTC 
CAGTATGTra 
CATCCGTTCA 
TACACTGGTT 
CTATGTTAAT 
CCAGCTCCTG 



TTCATCCXTQ 
CC3UBU3CAAT 
GAGGATGATA 
GGCAGAAGAT 
TAAGGTCACr 
TCAGGACTTT 
GTGTCCTAAA 
TATAAAAGAA 



1020 
lOBO 
1140 
1200 
1260 
1320 



1620 

leeo 

1740 

laoo 
laeo 

1920 



2220 
2280 
2340 
2400 
2460 
2S20 
2560 
2640 
2700 
2760 



3S40 
3600 
3660 



3960 
4020 
40 ao 



ATOAGGCCAO GAGTCTTTGC 



CACATCTGAG 
TATCTGTTGA 
TTTATATCAT 
ACTAAAATGA 
TTGATTTTAA 
TTGCAAACTA CAGAAAATGT TTGTTTTTAQ 
TAACTTTTAA 



GAAAATCAGT 
CATAGGATTC 
AATTTACTTA 



CTAGTTCTQT 
TGCOGCX»AA 
TTATGrrTQA 



TGACATTGAG 
G6AAQA6AAT 
AGCTGAGAGC 
CATTGTTTTC 
TTTCCCATCA 
TAACAATCTG 
TTOAATTTTA 



r CAACATTTTA C 



TGTCAAATTT 
TACAOTAGCC 
ATTOMXTAA 



4380 
4440 
4500 
4560 

4680 
4740 
4800 
4860 
4920 
4960 
AGTAGAAATA ATCTGTTACT TATTQTAAAT ACTGCCCTM} TGTCTCCATG GACCAAATTT 5040

ATATTTATAA TTGTAGATTT TTATATTTTA CTACTGAGTC AAGTTTTCIA GTTCIGIGTA 5100 

ATTOTTTAGT TTAATGACGT AGTTCATTAG CTG G T C CT A C TC TAOC AGTT TTCTGACATT 5160 

GTATTGTGTT ACCTAAGTCA TTAACTTTQT TTCAOCATQT AATTTTAACT TTTGTGGAAA 5220 

5 ATA«3AAATAC CTTCATTTTO AAAGAAGTTT TTATGAGAAT AACACCTTAC CAAACATTGT 5280 

• TCAAATQGTT TTTATCC3U«3 GAATTCCAAA AATAAATATA AATATTGCCA TTAAAAAAAA 5340 
AAAAAAAAAA AAAAAAAAAA AAAAAAA 



Seq ID NO: 182 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51
| | | | | |

MRILKRFLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALNQKNWG KKYPTCNSPK 60

QSPINIDEDL TQVNVNLKKL KFQGWDKTSL ENTFIHNTGK TVEINLTNDY RVSGGVSEMV 120

FKASKITFHW GKCNMSSDGS EHSLEGQKFP LEMQIYCFDA DRFSSFEEAV KGKGKLRALS 180

ILFEVGTEEN LDFKAIIDGV ESVSRFGKQA ALDPFILLNL LPNSTDKYYI YNGSLTSPPC 240

TDTVDWIVFK DTVSISESQL AVFCEVLTMQ QSGYVMLMDY LQNNFREQQY KFSRQVFSSY 300

TGKEEIHEAV CSSEPENVQA DPENYTSLLV TWERPRVVYD TMIEKFAVLY QQLDGEDQTK 360

HEFI.TD6SQD LQAIUINUJ) MMSYVLQIVA ICTUGLYGKY SDQLIVDMPT DNPELDLFPE 420 

LIGTEEIIKE EEBGKDIEHG AIVMPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTH 480 

RSPTRGSEFS GKGDVPNTSL NSTSQPVTKL ATEKDIS1.TS QTVTEIiPPHT VBGTSASLND 540 

GSKTVLRSPH MNLSGTAESL NTVSITEYEE ESLLTSFKLD TGAEDSSGSS PATSAIPFIS 600 

ENISQGYIFS SENPETITYD VLIPESARNA SEDSTSSGSE ESLKDPSMEG NVWFPSSTDI 660

TAQPDVGSGR ESFIiQTNYTB IRVDESBKTT KSFSAQPVMS QGPSVTDLEM PHYSTFAYFP 720 

TEWTPHAPTP SSRQQDIiVST VNWYSQrTQ PVYNABASNS SHBSRIOIAE GLESBKKAVI 780 

PLVIVSALTF ICLVVLVGIL lYHRKCFQTA HFYIiEDSTSP RVISTPPTPI PPISO DVSAI 840 

PIXRFPKBVA DLHASSGFTE BFETUCEFYQ EVQSCTVDIiQ ITADSSHHFD NKHKHRTTHI 900 

VAYDHSRVKL AQLAEKDGKL TDYINANYVD GYNRPKAYIA AQGPLKSTAE DFWRMIWEHN 960

VEVIVMITHL VEKGRRKCDQ YWPADGSEBY ONFLVTOKSV QVLAYYTTON FTLRNTKIKK 1020 

GSQKGRPSGR WTOYKYTQW PDMGVPEYSL PVLTFVRXAA YAKRKAVGPV WHCSAGVOR 1080 

TGTYIVI<DSM LQQIQHEGTV NlPGFIjKHIR SQRNYr.VQTE EQYVFIHDTL VEAILSKETE 1140 

VIJ3SHIHAYV NALLIPGPAG KTKLEKQFQL LSQSNIQQSD YSAALKQCMR EKtlRTSSIIP 1200 

35 VEHSRVGISS LSGBGTDYIN ASyiMGYYQS NEFIITQHPI. mriKDFWRM IWDHNAQLW 1260 

MIPOGQNMAS DEFVYWPNKD EPINCESFICV TLMAEEHKCL SHEEKLIIQO FILEATQOOY 1320 

VLEVRHFQCP KWSHPDSFIS KTFEIiISVIK EEAAimOGPM IVHDEHG6VT AGTFCALTTIi 1380 

MHQLEKENSV DVYOVAKMIN LHSPGVFADI EQyQFLYKVI IiSLVSTRQEE NPSTSLDSMG 1440 
AAIiPDGHIAE SLESI.V 

Seq ID NO: 183 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 148-4494

1 11 21 31 41 51

I I I I I I 

CACACATACG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120

CGGCGAGGGG CCGCAGACCG TCTGGAAATG CGAATCCTAA AGCGTTTCCT CGCTTGCATT 180

CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240

CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 480

GTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540

AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCTT TGATGCAGAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720

GATTTCAAAG CGATTATTGA TGGAGTCGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140

TGGGAAAGAC CTCGAGTCGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200

CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTGGACAT GCCTACTGAT 1380

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440

GAAGAGGGAA AAGACATTGA AGAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560

ACGAAATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100

GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160

AGCTTTCTCC AGACTAATTA CACTGAGATA CGTGTTGATG AATCTGAGAA GACAACCAAG 2220

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340

TCCAGACAAC AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400

GTATACAATG AGGCCAGTAA TAGTAGCCAT GAGTCTCGTA TTGGTCTAGC TGAGGGGTTG 2460

GAATCCGAOA ASAAGGCAGT 
CTAGTGQTTC TTGTGGGTAT 
TACTTAOAGG ACAGTACATC 
ATTTCAGATG ATGTOGGAGC 
CATGCAAGTA GTGGGTTTAC 
GGTATTACAG CAGACAGCTC 
ATCQTTGCXrT ATGATCATAG 
CTGACTGATT ATATCAATGC 
GCTGCCCAAG GCCCACTGAA 
AATQTQQAAO TTATIGTCAT 
CAGTACTGGC CTGCCGATGQ 
GTGCAAGTGC TTGCCTATTA 
AAGGGCTCX:C AGAAAGGAAG 
TGGCCTGACA TGGGAGTACC 






TATACCCCTT GT6ATGGTGT CAGCCCT6AC TTTTATCTGT 
TCTCATCTAC TGGAGGAAAT GCTTCCAGAC TGCACACTTT 
CCCTAGAGTT ATATOCACAC CTCCAACACC TATCTTTCCA 



AATTCCMTA 
TGAAGAATTT 
CAACCACCCA 
CAGGGTTAAG 
CAATTATGTT 
ATCX31CAGCT 
GATAACAAAC 



GAGGAAGTGC 
GACAACAAGC 
CTAGCACAGC 
GATGGCTACA 
GAAGATTTCT 
CnCGTCQAGA 



CAAAGCATGT 
AGAGCTGTAC 
ACaAGAATCX3 
TTGCTGAAAA 
ACAGACCAAA 
GOAQAATOAT 



AGAACAGGCA CATATATTGT 
GTCAACATAT TTGGCTTCTT 
GAGGAGCAAT ATGTCTTCAT 



GACTATTCTG 
CCTGTGGAAA 
AATGCCTCCT 
CTCCTTCATA 
GTTATGATTC 
GATGAGCCTA 
CTATCTAATG 
TATGTACTTG 
AOTAAAACTT 
ATGATTGTTC 
CTTATGCACC 
AAICTGATGA 
ATCCTCAGCC 
GGTGCAGCAT 
AAGGGGTGGG 
ATCAGTCTAG 
GGATTCTQCX: 
TACTTATTAT 
GGTATTTTTT 
AAACTACAGA 



GATATTCAAC 
GTAAATACTG 
ATTTTACTAC 
CATTAGCTGG 
CTrTGTTTCA 
AAGTTTTTAT 
TGCAAAAATA 
AAA 



AGCTAGAGAA 
CAGCCCTAAA 
GATCAAGQGT 
ATATCATGGG 
CCATCAAGGA 
CTGATGGCCA 
TAAATTGTGA 
AGGAAAAACT 
AAGTGAGGCA 
TTQAACTTAT 
ATQATGAGCaV 
AACTAGAAAA 
GGCCAGGAGT 
TTGTGAGCAC 
TGCCTGATGG 
GGGACTCACA 
TTCTGTTATC 
GCCAAATTTA 
GTTTQAACTA 
TCTGTATTGA 
AAATGTTTGT 
GAAATATAAC 
ATTTTACAAC 
CX:CTAGTGTC 
TGAGTC3U«3T 
TCTTACTCTA 
GCATGTAATT 
GAGAATAACA 
AATATAAATA 



ACOCAGTGGA 
AGAGTACTCC 
AGTGGGGCCT 
GCTAGACAGT 
AAAACACATC 
TCATGATACA 
TCATGCCTAT 



CTOCCA6TGC 
GTTGTC6TCC 
ATGTTGOWSC 
CGTTCACAAA 



TTCTGGTCAC 
TAAGAAACAC 
CACAGTATCA 
TGACCTTTQT 
ACTGCAGTGC 



CTTAATGCAC 



GAAATTATTT 
CCATACTTAG 
TCCTCATTCC 



GCAATGCAAC 



CTATTACCAG 
TTTCTGGAGG 
AAACATGGCA 
GAGCTTTAAG 
TATAATTCAG 
CTTTCAGTGT 
AAQTGTTATA 



AGGGAAAAaA 
TCCCTGAGTO 
AGCAATGAAT 
ATGATATGGG 
GAAGATGAAT 
GTCACTCTTA 



CCTAAATGGC 



AGAAAATTCC 



AAATATAGCT 
TCTGAGCATT 
TGTTGATTTC 
TATCATTAAC 
AAATGATTGA 
TTTTAACAGA 
TTTTAGTGTC 
TTTTAATACA 
TGCAGTATTC 
TCCATGGACC 
TTTCTAOTTC 



TTAACTTTTG 
TTGCXaVTTAA 



GTGGATGTTT 
ATTGAGCAGT 
GAGAATCCAT 
GAGAGCTTAG 
GTTTTCCTCT 
CCATCACCTG 
AATGTGTGCC 
ATTTTACAGT 
AAATTTCAAT 
AAATTTTTAG 
GTAGCCTGTA 
ACCTAAAGTA 
AAATTTATAT 
TGTGTAATTQ 
GACATTGTAT 
TGGAAAATAG 
CATTGTTCAA 
AAAAAAAAAA 



ATC6AACTTC 
GAGAAGGCAC 
TCATCATTAC 
A(XATAAIGC 
TTGTTTACTG 
TCGCTGAAGA 
TAGAAGCTAC 
CAAATCCAGA 
CTGCCAATAG 
CTTTCTGTGC 
ACCAGGTAQC 
ATCAGTTTCT 
CCACCTCTCT 
AGTCTTTAGT 
TCCTAAAATT 
ACAGTAACTT 
TTTTTGCAAG 
ATTTCTAAGA 



TGCAGATTTA 
TOTTGACTTA 
ATACATAAAT 
GGATGGCAAA 
AGCTTATATT 
ATGQQAACAT 
AAAATGTGAT 
TCAGAAGAGT 
AAAAATAAAA 
CTAC»CGC3M3 
GAGAAAG6CA 
TGGAGTTGGA 
CGAAGGAACT 
GGTACAAACT 
TAAAGAAACT 
TGGACCAGCA 
ACAGCAGAGT 
TTCTATCATC 



2520
2580
2640
2700
2760
2820
2880
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 



CCRGCACCCT 
CCAACTGGTG 
GCCAAATAAA 
ACACAAATGT 
ACAGGATGAT 
TAGCCCCATT 
GGATGGGCCT 
TCTGACAACC 
CAAGATGATC 
CTACAAAGTG 
GGACAGTAAT 
TTAACACAGA 
AGGCAGGAAA 
TCATGACATA 
ACTTGTAATT 



CTGTATTTGT 
AATAAAACAC 
GAAATAATCT 
TTATAATT6T 
TTTAGTTTAA 
TGTGTTACCT 
AAATAOCTTC 



TAGGAATTCC 
AGCAATTATC 
TCTTCCATAT 
GTTACTTATT 



I 

MRIUCRFIAC IQLLCVCRLD 
QSPINIDEDL TQVNVNLKKL 
FKASKITFHW GKCNMSSDGS 
ILFEVGTBSN LDFKAIIDGV 
TDTVDWIVPK DTVSISBSQIi 



21 



31 



IVQA 



I I 
WANGYYRQQR KLVEEIGHSY 
KFQGWDKTSD aiTFIHNTGK 
EHSI.EGQKFP LEMQIYCFDA 
BSVSRFGiCQA AIiOPFILLltl. 
AVPCEVLTMQ QfiGWMLMDY 



GSKTVLSSPH MHLSGTABSL 
EHISQGYIFS SENPBTITYD 
TAQE>DVGSGR ESFLQTNVTE 
TEVTPHAFTP SSRQQOLVST 
LVIVSAIiTFZ CIiWIiVGIIiI 
IKHFFKKVAD LBA8S6FTSE 
KLAQIAEKDQ KLTDYINANY 



NNSYVLQIVA ZCINGI.X6KY 
AIVNFGRDSA THQIRKKBFQ 
NSTSQPVTKL ATEKDISIjTS 
NTVSITEYEE ESLLTSFKLD 
VI.IPESAH1IA SEDSTSSGSE 
IRVDESEKTT KSFSA6PVMS 
VNWYSQTTQ PVTOEASNSS 



TGALNQKNWG 
TVEINLTNDY 
DRFSSFBEAV 
LPNSTOKYYI 
LQHNFSEQQY 
THIBKPAVLY 



QQPSVTOI.EK 
HESRIGLAEG 
VISTPPTPIP 



GRWTQYHYT QWPDMGVPEY 
SMLQQIQHEG TVNIFGFLKH 
YVNAIiLIPGP AGKTKLEKQF 
SSLSGEGTDY INASYIMGYY 
ABDEFVYWPN KDBPINCGSF 
CPKWPNPDSP ISKTFEI.ISV 



SLPVLTFVRK 
IRSQRNYLVQ 
QLLSQSKIQQ 
QSNEFIITQB 
KVTLMAEEHK 



TEEQYVFIHD 
SDYSAALKQC 
PLI.HTIKDFW 
CIiSNEEKIiII 



NREKNRTSSl 
RMIWDHNAQI. 
QDFIbEATQD 
VTAGTFCALT 



3840 
3 900 
3960 
4020 
4080 
4140 



4320 
4380 
4440 
4500 
4560
4620 



4920 
4980 
5040 
5100 
5160 



KKYPTCNSPK 
RVSGGVSEMV 
KGKGKLRALS 
YNGSLTSPPC 



QQLDGEDQTK 
DHPELOIiFPB 
GTKYMEAKTN 
VEOTSASLND 
PATSAIPFIS 
NVWPPSSTDI 
PHYSTPAYFP 
LESBKKAVIP 
PISDDVGAIP 
MIVAYDHSRV 
aNVEVI\ 



[ KKGSQKGRPS 



TBVIiDSHIHA 
IPVERSKVGI 
WMIPDGQNM 
DYVIiBVFHFQ 



SVDVYQVAKH INLMRP6VFA DIBQYQFLYK V1I.SI.V8TRQ BBIPSTSUIS NGAALPDGtil 






Seq ID NO: 185 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 501-4514



CACACATACG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA
CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA
CGGCGAGGGG CCGCAGACCG TCTGGAAATG CGAATCCTAA AGCGTTTCCT CGCTTGCATT
CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA
CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAT TGGGGAAAGA
AATATCCAAC ATGTAATAGC CCAAAACAAT CTCCTATCAA TATTGATGAA GATCTTACAC
AAGTAAATGT GAATCTTAAG AAACTTAAAT TTCAGGGTTG GGATAAAACA TCATTGGAAA
ACACATTCAT TCATAACACT GGGAAAACAG TGGAAATTAA TCTCACTAAT GACTACCGTG 
TCAGCGGAGG AGTTTCAGAA ATGGTGTTTA AAGCAAGCAA GATAACTTTT CACTGGGGAA 
AATGCAATAT GTCATCTGAT GGATCAGAGC ATAGTTTAGA AGGAC AAAAA TTTCCACTTa 
AGATGCAAAT CTACTGCTTT GATGOSGACC GATTTTCAAG TTTTGAGGftA GCAGTCAAAG 
GAAAAGGGAA GTTAAGAGCT TTATCCATTT TGTTTGAGGT TGGGACAGAA GAAAATTTGG 
ATTTCAAAGC GATTATTGAT GOAGTCGAAA GTGTTAGTCX3 TTTTGGGAAG CAGGCTGCTT 
TAOATCCATT CATACTCTTG AACCTTCTGC CAAACTCAAC TGACAAGTAT TACaTTTACA 
ATX3GCTCATT QACATCTCCT CCCTGCACAG ACACAGTTQA CTGGATTGTT TTTAAAGATA 
CAGTTAGCaT CTCTGAAAGC CAGTTGGCTG TTTTTrGTGA AGTTCTTACA ATGCAACAAT 
CTGGTTATGT CATGCTGATG GACTACTTAC AAAACAATTT TCGAfiAGCAA CAGTACAAGT 
TCTCTAGACA GOTGTTTTCC TCATACACTG GAAAGGAAGA GAITCATGAA GCaWSTTTGTA 
GTTCAGAACC AGAAAATGTT CAGGCTGACC CAGAGAATTA TACCAGCCTT CTTGTTACA.T 
GGGAAAGACC TCX3AGTCGTT TATGATACCA TGATTGAGAA GTTTGCAGTT TTSTACCAGC 
AGTTGGATGG AGAGGACX3UV ACCAAGCATG AATTrTTGAC AOATOGCTAT CAAQACnGG 
CTGCTATTCT CAATAATTTG CTACCCAATA- TGAGTTATGT TCTTCAGATA GTAGCCATAT 
GCACTAATGG CTTATATQGA AAATACAGCG ACCAACTGAT TGTCGACATG CCTACTGATA 
ATCXTTGAACT TGATCanTTC CCTQAATTAA TTQGAACTGA AGAAATAATC AAGGAGGAGG 
AAGAGGGAAA AGACATTGAA GAAGGCGCTA TTGTGAATCC TGGTAGAGAC AGTGCTACAA 
ACCAAATCAG GAAAAAGGAA CCCCAGATTT CTACCACAAC ACACTACAAT OSCATAGGGA 
CGAAATACAA TGAAGCCAAG ACTAACCGAT CCCCAACAAQ AQGAAGTGAA TTCTCTGGAA 
AGGGTGATGT TCCCAATACA TCTTTAAATT CCaCTTCCCA ACCAGTCACT AAATTAGCCA 
CAGAAAAAGA TATTTCCTTG ACTTCTCAGA CTGTGACTGA ACTGCCACCT C3VCACTGTGO 
AAGGTACTTC AGCCTCTTTA AATGATGGCT CTAAAACTGT TCTTAGATCT OC»CATATGA 
ACTTGTCGGG GACTGCAGAA TCCTTAAATA CAGTTTCTAT AACAOAATAT GAGGAGGAGA 
GTTTATTGAC OVGTTTCAAG CTTGATACTG GAGCTGAAGA TTCTTCAGGC TCCAGTCCCG 
CAACTTCTGC TATCCCATTC ATCTCTGAGA ACATATCCCA AGGGTATATA TTTTCCTCCX3 
AAAACCCAGA GACAATAACA TATGATGTCC TTATACCAGA ATCTGCTAGA AATGCTTCCG 
AAGATTCAAC TTCATCAGGT TCAGAAGAAT CACTAAAGGA TCCTTCTATG GAGQGAAATG 
TGTGGTTTCC TAGCTCTACA OACATAACAfi CACaGCCCGA TGTTGGATCA GGCAGAGAGA 
GCTTTCTCCA GACTAATTAC ACTGa«WtAC GTGTTGATGA ATCTGAGAAO ACAACCAAGT 
CCTTTTCTGC AGGCCCAGTG ATGTCACAGG QTCCCTCAOT TACAG ATCTO GAAATQCCAC 
ATTATTCTAC CTTTGCCTRC TTCCCAACTG RGGTAACACC TCATGCTTTT ACCCCATCCT 
CCAGACAACA GGATTTGGTC TCCAOSGTCA ACGTGGTATA CTCGCAGACA ACCCAACCGG 
TATACAATGA GGCCAGTAAT AGTAGCCATG AGTCTCGTAT TGGTCTAGCT GAGGGGTTGG 
AATCOSAGAA GAAGGCAGTT ATACCCXTTTG TGATCGTGTC AGCCCTGACT rrTATCTCTC 
TAGTGGTTCT TGTGGGTATT CTCATCTACT GGAGGAAATG CTTCCAGACT GCACACTTTT 
ACTTAGAGGA CAGTACATCC CCTAeAGTTA TATCCACACC TCCAACACCT ATCTTTCCAA 
TTTCAGATGA TGTOGQAGCA ATTCCAATAA AGCACTTTCC AAAGCATGTT GCAGATTTAC 
ATGCAAOTAG TGGGTTTACT GAAOAATTTG AGA<aCTaA& AGAGTTTTAC CAGGAAGTGC 
AGAGCTOTAC TOTTOACTTA GCnTATTACAG CAGACAOCTC CAACCAO^CA QACAACAAGC 
ACAAGAATCX3 ATACATAAAT ATOGTTGCCT ATGATCATAG CAGGGTTAAG CTAGCaVCAGC 
TTGCTGAAAA GGATGGCAAA CTGACTGATT ATATCAATGC CAATTATGTT GATGGCTACA 
ACAGACCAAA AGCTTATATT GCTGCCCAAG GCCCACTGAA ATCCACAGCT GAAGAITTCT 
GGAGAATOAT ATGGGAACAT AATGTGGAAG TTATTGTCAT GATAACAAAC CTCGTGGAGA 
AAGGAAGGAG AAAATGTGAT CAGTACTGGC CTGCOGATGG GAGT6AGGAG TACGGGAACT 
TTCTGGTCAC TCAGAAGAGT GTGCAAGTGC TTGCCTATTA TACTGTGAGG AATTTTACTC 
TAAGAAACAC AAAAATAAAA AAGGGCTCCC AGAAAGGAAG ACCCAGTGGA CXSTGTGGTCA 
CRCAGTATCA CTACACGCAG TGaCCTOACA TGGGAGTACC AGAGTACTCC CTGCCAGTGC 
TGACCTTTGT GAGAAAGGCA GCCTATGCCA AOCGCCATGC AGTeGGGCCT GTTGTCGTCC 
ACrOCAOTGC TGGAGTTGaA ASAACaCGCa^ CATATATTGT GCTAGACAGT ATGTTGCAGC 
AGATTCAACA CX3AAGGAACT GTCAACATAT TTGGCTTCTT AAAACACATC CGTTCACAAA 
GAAATTATTT GGTACAAACT GAGGAGCAAT ATGTCTTCAT TCATGATACA CTGGTTGAGG 
CCyVTACTTAG TAAAGAAACT GAGGTGCTGG ACAGTCATAT TCATGCXTTAT CTTAATGCAC 
TCCTCATTCC TGGACCAGCA GGCAAAACAA AGCTAGAGAA ACAATTCCAG CTCCTaAGO: 
AGTOUIATAT ACAGCAGAGX GACTATTCTQ CAGCCCTAAA GCAATGCAAC AGGOAAAAGA 
ATOGAACTTC TTCTATCATC CXrTGTGGAAA GATCAAGGGT TGGCATTTCA TCCCTGAGTG 
GAGAAGGCAC AOACTACATC AATGCCTCCT ATATCATGGG CTATTACCAG AGCAATGAAT 
TC»TCATTAC CCASCACCCI CTOCTTCATA CCATCAAGGA TTTCTGGAGG ATGATATGGG 
ACCATAATGC CCAACTGGTG GTTATGATTC CTGATGQCCA AAACATGGCA GAAGATGAAT 
TTQTTTACTG GCCAAATAAA GATGAGCCTA TAAATTGTGA GACCTTTAAO OT CACT CTTA 
TGGCTCAAGA ACACAAATGT CTATCTAATG AGGAAAAACT TMARTTCAO OACmATCT 
TAGAAGCTAC ACAGQATGAT TATGTACTTG AAGTGAGGCA CT TTCAO TCT CCTAAATCGC 
CAAATCCAGA TAGCCCCATT AGTAAAACTT TTGAACTTAT AAGTGTTATA AAAOAAGAAG 
CTGCCAATAG GQATGGGCCT ATGATTGTTC ATGATGAGCA TGGAGGAGTG AGGGCAGGAA 
CTTTCIOIGC TCTGACAACC CTTATGCACC AACTAGAAAA AGAAAATTCC GTGGATGTTT 
ACCAGGIAGC CAAGATOATC AATCTGATGA GGCCAGGAGT CTTTGCTGAC ATTGAGCAGT 
ATCAertTCT CTACRAAGTQ ATCCTOVGCC TTGTGAGCAC AAGGCAGGAA GA6AATCCAT 
CCACCTCTCT GGACAGTAAT GGTGCAGCAT TGCCT6ATGQ AAATATAGCT GAGAGCTTAG 
AGTCTTTAGT TTAACaCAGA AAGGGGTGGG GGGACTCACA TCTGAGC3VTT GTTTTCCTCT 
TCCTAAAATT AGGCAGGAAA ATCAOTCTAG TTCTGTTATC TGTTGATTTC CCATCACCTQ 
ACAGTAACTT TCATGACATA GGATTCT G CC GCCAAATTTA TATCATTAAC AATQTGTaCC 
TTTTTGCAAG ACTTGTAATT TACTTATTAT GTTTGAACTA AAATG ATTQA ATTTTACAGT 
ATTTCTAAGA ATGGAATTGT GGTATTTTTT TCTGrATTGA TTTTAACAGA AA ATTTCAA T 
TTATAGAGGT TAGOAATTCC AAACTACAGA AAATQTTTGT TTTTAGTGTC AAATTTTTAG 
CICTATTTOT AQCAATTATC AGGTTTGCTA GAAATATAAC TTTTAATACA GTAGCCTGTA 
AATAAAACAC TCTTOCATAT OATATTCAAC ATTTTACAAC TGCAGTATTC ACCTAAAGTA 
OAAATAATCT GTIACTTATr GTAAATACTG CCXTTAGTCTC TCCATGGACC AAATTTATAT 
TTATAATTGT AQATTTTTA.T ATTTTACXAC TGAGTCAAGT TTTC TAGTrC TGTCTAMTC 
TTTASnTAA T6ACGTMSXT CATTAGCTGG TCTTACTCTA CCAGTTTTCT GACATTGTAT 






1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 

2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580
2640 
2700 

2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540
3600 
3660 
3720 
3780 
3840 
3900 
3960 

4080 
4140 
4200 
4260 
4320 
4380 

4500
4560 
4620 
4680 
4740 
4800 
4860 
4920 

5040 

5160 



TGTGTTACCT AAGTCATTAA CTTTGTTTCA GCATGTAATT TTAACTTTTG TGGAAAATAG
AAATACCTTC ATTTTGAAAG AAGTTTTTAT GAGAATAACA CCTTACCAAA CATTGTTCAA






31 



MVFKASKITF HWGKCNMSSD 
ID 



PCTDTVDWIV 



PBLIGTEBII 
TNRSPTRGSE 
IIDGSKTVIJIS 
ISENISQGYI 



ZPLVIVSALT 
IPIKHFPXHV 
IVAXDHSRVK 
MVEVIVMITN 



FKDTVSISES 
AVCSSEPENV 
QDLGAIUINIj 
KEBEEGKDIE 
FSGKGDVPNT 
PHMNLSGTAE 
FSSBKPBTIT 
GRESFUJTNV 
TPSSRQQDLV 
FICLVVLVGI 
ADLHASSGPT 
LAQLAEKDGK 



fplemqiycf dadrfssfee 
qaaijOpfill nllpnstdky 
mqqsgyvmlm dylqhnfreq 

LVTWERPRW VDTMIEKFAV 
VAICTDGIiYG KYSDQLIVDM 
SATNQIRKKE PQISTTTHYN 
KXiATEKDISIi TSQTVTEIiPF 
EEESLLTSFK LDTQAEDSSG 



MSQGPSVTDL 



RWTQYHYTQ 
MLQQIQHEGT 
VNALLIPGPA 



PVERSRVGXS 
VMIPDGQNMA 
YVLEVRHFQC 



GAALPDGNIA 



EDEPVYWPMK 
PKHPNPDSPI 
VDVYQVAKKI 
ESLBSIiV 



GSEHSLEGQK 
GVESVSRFGK 
QIAVFCEVLT 
QADPENYTSL 
1>PNMSYVI,QI 
EGAIVNPGRO 
SliNSTSQPVT 
SLNTVSITEY 
YDVIiIPBSAR 

TEIRVDESEK TTKSFSAGPV 
STVNWYSQT TQPVYNEASN 
IjIYWHKCFQT AHFlfLEDSXS 
QEVQSCTVDIi 

DGmRPKAYI AAQGPUCSTA 
VCafPLVTQKS VQVLAVYTVR 
LPVLTFVRKA AYAKRHAVaP 
VNIFOFLKHI RSQRKYLVQT EBQYVFIHDT 
GKTKLEKQFQ I.I.SQSNIQQS DYSAALKQCN 
MASYIMGYYQ SNEFIITQHP IiLHTIKDFMR 
DEPINCESFK 
SKTFELISVl 
Nt<MRPGVFAD lEQYQFLYKV ILSIiVSTRQB 



5220
5280 
5340 



LYQQLDGEDQ 240 
RIGTKYMEAK
HTVEGTSASL 
SSPATSAIPF 
EOIVWFPSST 



PRVISTPPTP 




Seq ID NO: 187 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 148-4632



3 CACGCAOQAT CTCACTTOSA 



CQGCGAGGOa C0GCAGAC06 TCTGOAAATC 
GTGTTTQCCQ 
AGATTGGCTG 
CATGTAATAG 
TGAATCTTAA 
TTCATAACAC 
GAGTTTCAGA 



CTTGrTGAAG 



CAAGTAAATG 
AACACATTCA 
GTCAGCGGAG 
AAATGCAATA 
GA6ATGCAAA 



6ATTTCAAAG 
TTAGATCCAT 
AATGGCTCAT 
ACAGTTAGCA 
TCTGGTTATG 
TTCTCTAGAC 
AGTTCAGAAC 
TGGGAAA6AC 
CaGTTGGATG 
GGTGCTATTC 
TGCACTAATG 
AATCXnXSAAC 



TCTACTGCTT 
AQWAAGAGC 
CGATTATTGA 
TCATACTGTT 



TCATGCTGAT 
AGGTQTTTTC 
CAGAAAATGT 
CTCXJAGTCGT 



AACCAAATCA 
AOQAAATACA 
AA6GGT0ATG 



TCAATAATTT 
GCTTATATGG 
TTGATCTTTT 
AAGACATTQA 



GTCCTATACA 
CCCAAAACAA 
GAAACTTAAA 
TGGGAAAACA 
AATGGTGTTT 
TGGATCAGAG 
TGATGCGGAC 
TTTATCCATT 
TGGAGTCX3AA 
GAACCTTCTG 
TCCCTOCACA 
CCAGTT6GCT 
GGACTACTTA 
CTCATACACT 
TCAGGCTGAC 
TTATGATACC 

GCTACCCAAT 
AAAATACAGC 
CCCTGAATTA 



TCTATACACT 
CTCrCCACTC 
CXSAATCCTAA 
OCTAAIGGAT 



GGAGGATTAA 



AAG6TTTCCT 



51 

1 

AACAAACAAA 60 

AG GAGC CGCA 120 

OGCTTGCATT 180 

ACAGAGAAAA 240 

TTGGGGAAAG 300 

AGATCTTACA 360 

ATCATTGGAA 420 

TGACTACCGT 480 

TCACTGGGGA 540 

ATTTCCACTT 600 

AGCAGTCAAA 660 

AGAAAATTTC 720 

GCAGGCTGCT 780 

TTACATTTAC 840 

TTTTAAAGAT 90 0 

AATGCAACAA 960 

CAAAACAATT TTCaAQAGCA AOUJXACAAG 1020 



TTGTTTGAGG 
AGTGTTAGTC 
CCAAACTCAA 



ATCTCACTAA 
AGATAACTTT 
AAGGACAAAA 
GTTTTGAGGA 
TTGGGRCAGR 
GTTTTGGGAA 
CTGACAAGTA 
ACTGGAXTGT 



GOAAAGGAAG AGATTCATQA 
CCAGAGAATT ATACCA6CCT 
ATGATTGAGA AGTTTGCAGT 
GAATTTTTGA CAGATGGCTA 
ATGW3TTATG TTCTTCAGAT 
TTGTCGACAT 
AAQAAATAAT 
CTGGTAGAGA 
CACACTACAA 



AGCASTTTGT 
TCTTGTTACA 
TTTGTACCAG 
TCAAGACTTG 
AGTAGCCATA 
GCCTACTGAT 
CAAGGAGQAG 
CAGTGCTACA 



.GAAG6TACTT 
A ACTTO ICGO 
AGTTTATTGA 
GCAACTTCTG 



AQCTTTCTCC 
TCCTTTTCTG 
CATTATTCTA 
TCCAGACAAC 
GTATACAATC 
GAATCCOAGA 
CTAGTGGTTC 



TTCCCAATAC 
ATATTTCCTT 
CAGCCTCTTT 
GGACTGCAOA 
CCAGTTTCAA 
CTATCCCATT 
AGACAATAAC 
CTTCATCAGG 
CTAOCTCTAC 
AGACTAATTA 
CAGGCCCAGT 
CCTTTGCCTA 
AGGATTTGGT 
AGGCCAGTAA 



ATCTTTAAAT 
GACTTCTCAG 
AAATGATGGC 
ATCCTTAAAT 
GCTTGATACT 
CATCTCT6AG 
ATATGATGTC 



TCCCCAACAA 
TCCACTTCCC 
ACTGTGACTG 
TCTAAAACTG 
ACAGTTTCTA 
GGAGCTGAAG 
AACATATCCC 



VTTCTCTGGA 



TAAATTAGCC 
TCACACTGTG 
TCCACATATG 
TGAiGGAGGAG 
CTCCAGTCCC 
ATTTTCCTCC 



\ TCACTAAAOG f 



ATTTCAQATG 
CAXGCAA8TA 
CAOAGCTGTA 



TTGTGGGTAT 
ACAOTACATC 
ATOTCGGAGC 
GTGGSTTTAC 
CTGTTQACTT 



AATTCCAATA 
TGAAGAATTT 
AGGTATTACA 



CGTOTTGATQ 
GGTCCCTCAG 
GAGGTAACAC 
AACGTGGTAT 
GAGTCrCGTA 
GTGATCGTGT 
TGGAGGAAAT 
ATATCCACAC 
AAGCACTTTC 



AATCTQAGAA 
TTACAG ATCT 
CTCATGCTTT 
ACTCGCAGAC 
TTGGTCTAGC 
CAGCCXTTGAC 
GCTTCCAGAC 



GACAAOCAA6 
GGAAATGCCA 
TACCCCaTCC 
AACCCAACCG 



TTTTATCTGT 
TGCACACTTT 
TATCTTTCCA 



1320 
1380 
1440 



1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 

2580 
2640 
2700 



GCA6ACAGCT CCAAOCACCX: AGACAACAA6 2820 



CACAAGAATC GATACATAAA
CTTGCTGAAA AQGATGGCAA 
AACAGACCAA AAGCTTATAT 
TGGAGAATGA TATGGGAACA. 
AAA6GAA6GA OAAAATGTGA 
TTTCTGGTCA CTCAOAAGAG 
CTAAGAAACA CAAAAATAAA 
ACACAGTATC ACTACACGCA 
CTQACCTTTG TGAGAAAQGC 
CACTGCAGTG CTGGAGTTGG 
CAGATTCAAC ACGAAGGAAC 
AGAAATTATT TGGTACAAAC 
GCCATACTTA GTAAAGAAAC 
CTCCTCATrC CTGGACCAGC 
CTGTCACCCA GGCTGGAGTG 
GGCTTAACra ATCCTCCTAC 
TCAAATATAC AGCAGAGTGA 
CGAACTTCTT CTATCATCCC 
GAAGGC3VCAG ACTACATCAA 
ATCATTACCC AGCACCCTCT 
CATAATGCCC AACTGGTGGT 
GTTTACTGGC CAAATAAAGA 
GCTGAACAAC ACAAATGTCT 
GAAGCTACAC AGOATQATTA 
AATCCAGATA GCCCXTVTTAG 
GOCAATAGGG ATGGGCCTAT 
TTCTGTGCTC TOACAACCCT 
CAGGTAGCCA AGATGATCAA 
CAGTTTCTCT ACAAAQTCAT 
ACCTCTCTGG ACAOTAATQO 
TCTTTAGTTT AACACAOAAA 
CTAAAATTAG GCAGGAAAAT 
AGTAACTTTC ATGACATAGG 
TTTGCAAGAC TTGTAATTTA 
TTCTAAGAAT GGAATTGTGG 
ATAGAG6TTA GGAATTCCAA 
GTATTTGTAG CAATTATCAG 
TAAAACACTC TTCCATA.TGA 
AATAATCTGT TACTTATTGT 
ATAATTGTAG ATTTTTATAT 
TAGTTTAATG ACGTAGTTCA 
TCTTACXn-AA GTCATTAACT 
ATACCTTCAT TTTGAAAGAA 
GGTTTTTATC CAAGGAATTG 



TATCGTTGCC 
ACTGACTGAT 
TGCTQCCCAA 



TATGATCATA 
TATATCAATG 
GGCX:CACTaA 



TCAGTACTGG 
TGTGCAAGT6 
AAAGGGCrCC 
GTGGCCTGAC 
AGCCTATGCC 



CCTGCCQATQ 



CAGAAAGGAA 
ATGGGAGTAC 
AAGCGCCATO 



GCAGGOTTAA 
CCAATTATGT 
AATCCACAGC 
TGATAACAAA 
GGAGTOAGOA 
ATACTGTGAG 
GACCCAGTGG 
CAGAGTACrC 



TGTCAACATA 



TGAGGTGCTG 



CAGAGGCACA 
CTCAGCCrCC 
CTATTCTGCA 
TGTGGAAAGA 
TGCCTCCTAT 
CCTTC3VTACC 



TTTGQCTTCT 
TATGTCTTCA 
GACAGTCATA 
AAGCTAGAGA 
ATCTCGGCTC 
CGAGTGGCTG 
GCCCTAAAGC 
TCAAGGGTTG 



TGCTAGACA6 
TAAAACACAT 
TTCATGATAC 
TTCATGCCTA 
AACAATTCCA 
ACTGCAACCT 
GGACTATACT 
AATGCAACAG 
GCATTTCATC 
ATTACCAGAQ 



GTACGOGAAC 
GAATTTTACT 
ACGTGTGGTC 
CCTGCCAGTG 
TGTTGTCGTC 
TATGTTGCAQ 



ACTGGTTGAG 
TGTTAATGCA 

GGGTCTCACT 
TCCTCTCCCT 
CCTGAGCCAG 
GGAAAAGAAT 
CCTGAGTGQA 
CAATQAATTC 



2880 
2940 
3000 
3060 
3120 
3180 

3300 
3360 
3420 
3480 
3540
3600 
3660 
3720 
3780 
3840 
3900 
3960 
TGAGCCTATA
ATCTAATGAG 
TGTACTTGAA 
TAAAACTTTT 
GATTGTTCAT 
TATGCACCauV 
TCTGATGAG6 



GATGGCX^AA 
AATTaTOAOA 
GAAAAACTTA 
GTGAGGCACT 
GAACTTATAA 
GATGAGCATG 
CTAGAAAAAG 



GCTTTAAQGT 
TAATTCAGGA 
TTCAGTGTCC 
GTGTTATAAA 
GAGGAGTGAC 
AAAATTCCOT 
TTGCTGACAT 



CCTCAGCXTTT GTGGGCACAA GOCAGSAAOA 
ATATAGCIQA 
TQAOCATTQT 
TTGATTTCCC 
TCATTAACAA 
ATGATTGAAT 
TTAACA6AAA 
TTAGTGTCAA 
TTAATACAGT 
CAGTATTCAC 
CATB GACCAA 
TCTASTTCT6 
AGTTTTCTGA 
AACTTTTOTO 



AGATGAATTT 
CACTCTTATQ 
CTTTATCTTA 
TAAATGGCCA 
AGAAGAAGCT 
GGCAGGAACT 
GGATGrrTAC 
TGAGCAGTAT 
GAATCCATCC 



GACTCACATC 
CAGTCTAGTT CTGTTATCTQ 
ATTCTGCXS3C CAAATTTATA 
CTTATTATGT TTGAACTAAA 
TATTTTTTTC TOTATTGATT 
ACTAC3W3AAA ATGTTTGTTT 
GTTTGCTAGA AATATAACTT 



TTTCCTCTTC 
ATCACCTGAC 
TGTGTGCCTT 
TTTACAOTAT 
ATTTCAATTT 
ATTTTTAGCT 
AGCCTOTAAA 



TTAGCTCQIC 
TTGTrrCAGC 
GTTTTTATGA 
CAAAAATAAA 



CTAGTGTCTC 
AGTCAA6TTT 
TTACTCTACC 
ATGTAATTTT 
GAATAACACC 
TATAAATATT 



GCCATTAAAA ;i 



4440 
4500 
4560 
4620 
4680 
4740 



4980 
5040
5100
5160
5220
5280
5340
5400
S4S0 






MRILKHFLAC 
QSPINIOKDL 
PKASKITFHW 



EMISQGYIPS 



EEGGKDIBEG 
GKCroVPNTSL 
MHLSGTAESL 
SENPETITYD 
ESPI.QTMYTE 
SSRQQDIiVST 
CLVVLVGILI 



21 
I 

3 WANGYYRQQR 
j KFQGWDKTSL 
EHSIiEGQKFP 
ESVSRFGKQA 
AVPCEVLTMQ 
DPENYTSliV 
NMSYVUjrVA 
AIVNPGROSA 
NSTSQPVTKL 
NTVSITEICEE 
VLIPESARNA 
IRVDESEKTT 
VNWYSQTTQ 
YWRKCFQXAH 



31 
I 

KLVEEIGWSY 
ENTFIHNTGX 
LEMQIYCPDA 
ALDPFIIiIiNL 
QSGYVMLMDY 
rWERPRWYD 
ICTNGliYGKy 
TNQIHKKEPQ 
ATBKDISLTS 
ESLIiTSFKLO 



TGAUJQKNWG 
TVEINLTNDY 
DRFSSFBEAV 
I.PNSTDKYYI 
LONNFREQQY 
TMIEKPAVLY 
aDQIiIVDMPT 
ISTTTHYNRI 
QTVTELPPHT 



51 

I 

KKYPTOJSPK 
RVSGGVSBW 

YMGSI.TSPPC 
KFSSQVFaSY 



VEGTSASUID 540 



GlYIVLQSKb 
LOSRIBAYVN 
SRVAGTILLS 
YIMOYYQSHE 



QLAEKDGKLT DYIKANYVDG 
EKORRKCDQY WPADGSEEYQ 
VTQYHYTQWP DMGVPEYSLP 
QQIQHEGTVN IFGFI>KHIRS 
ALLIPGPAGK TKIiEKQFQC>L 
QSNIQQSDYS AALKQQIREK 
FIITQHFUJI TIKDFWHMIW 



TGAEDSSaSS PATSAIPFIS 600 

SEDSTSSGSB BSUCDPSMEQ HVWFPSSTDI 660 
KSFSAGPVMS QGPSVTDLEM PHYSTFAYFP 720 
PVYNEASNSS HBSRXGLABG LESEKKAVIP 780 
FYLEDSTSPR VISTPPTPIF PISDDVGAIP 840 
VQSCTVDLGI TADSSNHPDN KHKNRYINIV 
YMRPKAYIAA QGPLKSTAED FVrRMIWEKNV 
NFLVTQKSVQ VLAYVTVHHF TUUITKIKKG 
VLTFVRKAAY AKRHAVQFVV VHCBASVGRT 
QWm.VaTKE QYVFIHDTLV EAILSKBTKV 
TLSPRLECRO TISAHCNLPL PGLTDPPTSA 



RPGVFADIBQ V 



E IfATQDDYVI. BVRREQCPKH PMPDSPISKT 
3 TFCALTTLMH QI.BKEN3VOV YQVAKHIHUI 
? STSLDSMGAA LPOCHIAESL ESI.V 



1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 



Seq ID NO: 189 DNA sequence
Nucleic Acid Accession #: NM_002820
Coding sequence: 304..831

1 11 21 31 41 51 

85 CCGGTTCGCA AAGAAGCTGA CTTCAGAGGG GGAAACTTTC TTCTTTTAGG AGGCGQTTAO 
CCCTGTTCCA CGAACCCAGG AGAACTGCTG GCCAGATTAA TTAGACATTO CTATGGQBGA 
COTGTAAACA CACTACTTAT CaTTGATGCA TATATAAAAC CATTTTATTT TCGCtATTAT 




TTCAGAQOAA QCGCCTCTGA TTTGTTTCTT TTTTCCCTTT TTOCTCTTTC TGGCTGTGTG 240 

GITTGGAGAA. AOCACAGITG GAGTAGCGGQ TTQCTAAATA A6TCCOGAGC GCGAGOGGAG 300 

ACGATGCAGC GGAGACTGGT TCAGCAGTGG AGCGTCGCGG TGTTCCTGCT GAGCTACGCG 360

GTGCCCTCCT GCGGGCGCTC GGTGGAGGGT CTCAGCCGCC GCCTCAAAAG AGCTGTGTCT 420

GAACATCAGC TCCTCCATGA CAAGGGGAAG TCCATCCAAG ATTTACGGCG ACGATTCTTC 480

CTTCACCATC TGATCGCAGA AATCCACACA GCTGAAATCA GAGCTACCTC GGAGGTGTCC 540

CCTAACTCCA AGCCCTCTCC CAACACAAAG AACCACCCCG TCCGATTTGG GTCTGATGAT 600

GAGGGCAGAT ACCTAACTCA GGAAACTAAC AAGGTGGAGA CGTACAAAGA GCAGCCGCTC 660

AAGACACCTG GGAAGAAAAA GAAAGGCAAG CCCGGGAAAC GCAAGGAGCA GGAAAAGAAA 720

AAACGGCGAA CTCGCTCTGC CTGGTTAGAC TCTGGAGTGA CTGGGAGTGG GCTAGAAGGG 780

GACCACCTGT CTGACACCTC CACAACGTCG CTGGAGCTCG ATTCACGGTA ACAGGCTTCT 840

CTGGCCCX3TA GCCTCA60SG GaTOCTCTCR GCTGGGTTTT OGAGCCTCCC TTCTGCCTTG 900 

GCTTGGACAA ACCTAGAATT TTCTCCCTTT ATGTATCTCT ATCGATTGTG TAGCAATTGA 960 

CAGAGAATAA CTCAGAATAT TGTCTGCCTT AAAGCAGTAC CCCCCTACCA CRCACACCCC 1020 

15 TGTCCTCCAG CACCATAGAG AQGCGCTAGA GCCCATTCCT CTTTCTCCAC CGTCACCCRA 1080 

CATCAATCCT TTACCACTCT ACCAAATAAT TTCATATTCA AGCTTCAGAA GCTAGTGACC 1140 

ATCTTCATAA TTTGCTGGAG AAGTGTATTT CTTCCCCTTA CTCTCACaCC TGGGCAAACT 1200 

TTCTTCAGTQ TTTTTCATTT CTTACGTTCT TTC3VCTTCAA GQGAC3AATAT AGAAGCATTT 1260 

GATATTATCT ACAAACACTO CA<»ACAGCA TCATGTCS.TA AACGATTCTO AOOCATTCAC 1320. 

20 ACTTTTTATT TAATTAAATG TATTTAATTA AATCTCAAAT TTATTTTAAT QTAAMAACT 1380 

TAAATTATGT TTTAAACACA TGCCTTAAAT TTGTTTAATT AAATTTAACT CTGGTTTCTA 1440 

CCAGCTCATA CRAAATAAAT GGTTTCTGAA AATGTTTAAG TATTAACTTA CAAGQATATA 1500 

GGTTTTTCTC ATGTATCm TTGTTCaVTTG GCAAQATGAA ATAATTTTTC TAGGGTAATG 1560 

CCGTAGGAAA AATAAAACTT CACATTTAAA AAAAA 




I OT_ooaaii 



I I I I I I 

MQRRLVQQWS VAVFLLSYAV PSCGRSVEGL SRRLKRAVSE HQLLHDKGKS IQDLRRRFFL
HHLIAEIHTA EIRATSEVSP NSKPSPNTKN HPVRFGSDDE GRYLTQETNK VETYKEQPLK
TPGKKKKGKP GKRKEQEKKK RRTRSAWLDS GVTGSGLEGD HLSDTSTTSL ELDSR
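
The "Coding sequence" field that accompanies each nucleic acid entry gives 1-based, inclusive start and end positions. The following Python sketch is illustrative only and is not part of the original listing. It shows one way such a field may be parsed, checked for a whole number of codons, and used to slice the coding region out of a sequence written in blocks of ten. The function names are hypothetical, and the residue count assumes that the final codon of the stated range is a stop codon.

```python
import re


def parse_cds(field: str) -> tuple[int, int]:
    """Parse a coordinate field such as '304..831' or '148-4494'."""
    match = re.fullmatch(r"\s*(\d+)\s*(?:\.\.|-)\s*(\d+)\s*", field)
    if match is None:
        raise ValueError(f"unrecognized coordinate field: {field!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinate range: {start}-{end}")
    return start, end


def cds_summary(field: str) -> dict:
    """Return nucleotide, codon and residue counts for a coding range."""
    start, end = parse_cds(field)
    length = end - start + 1  # both end points are included
    if length % 3 != 0:
        raise ValueError(f"range of {length} nt is not a whole number of codons")
    codons = length // 3
    # Assumption: the last codon in the range is a stop codon.
    return {"nucleotides": length, "codons": codons, "residues": codons - 1}


def extract_cds(dna: str, field: str) -> str:
    """Slice the coding region out of a sequence printed in spaced blocks."""
    start, end = parse_cds(field)
    cleaned = "".join(dna.split()).upper()  # drop spaces and line breaks
    if end > len(cleaned):
        raise ValueError("coding range extends past the end of the sequence")
    return cleaned[start - 1:end]  # convert 1-based inclusive to a Python slice
```

Under these assumptions, the range 304..831 spans 528 nucleotides, that is, 176 codons or 175 residues plus a stop codon. A range that is not a multiple of three raises an error, which may help flag a coordinate damaged in transcription.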

Seq ID NO: 191 DNA sequence
Nucleic Acid Accession #: XM_059328
Coding sequence: 52..1023

1 11 21 31 41 51

I I I I I I 

GGGCTGTCCG GCCCACTCCC CTGGGAGCGC GAGCGGTGGA CCCAGGCGGC CATGTCCCGC 60 

CCTCGCATGC GCCTGGTGGT CACCGCGGAC GACTTTGGTT ACTGCCCGCG ACGCGATGAG 120

GGTATCGTGG AGGCCTTTCT GGCCGGGGCT GTGACCAGCG TGTCCCTCCT GGTCAACGGT 180

GCGGCCACGG AGAGCGCGGC GGAGCTGGCC CGCAGGCACA GCATCCCCAC GGGCCTCCAC 240

GCCAACCTGT CGGAGGGCCG CCCCGTGGGT CCGGCCCGCC GTGGCGCCTC ATCGCTGCTC 300

GGCCCGGAAG GCTTCTTCCT TGGCAAGATG GGATTCCGGG AGGCGGTGGC GGCCGGAGAC 360

GTGGATTTGC CTCAGGTGCG GGAGGAGCTC GAGGCCCAAC TAAGCTGCTT CCGGGAGCTG 420

CTGGGCAGGG CCCCCACGCA CGCGGACGGG CACCAGCACG TGCACGTGCT CCCAGGCGTG 480

TGCCAGGTGT TCGCCGAGGC GCTGCAGGCC TATGGGGTGC GCTTTACGCG ACTGCCGCTG 540

GAGCGCGGT6 TGGGTGQCTG CACTTGGCTa GAGGCCCCOO GGGQTQOCIT CGCCTGCGCC GOO 

GTGGAGOGCG AGGCCCGGGC 06C0SIGGGC CCCTTCTCCC GCCACGGCCT eCGQTGGACA 660 

GACGCCTTCG TGGGCCTGAQ CACTTGOGGC CGGCACATGT CCGCTCAC06 OGTCTCOSGG 720 

55 GCCCTGGCGC GGGTCCTGGA AGGTACCCTA GCGGGCX3VCA CKCTGACAGC CGAGCTGATG 780 

GCGC31CCCCX3 GCTACCCCAG TGTGCCTCCC ACCGGCGGCT GCGGTGAAGQ CCCCGACGCT 840 

TTCTCTTGCT CTTGGGAGCG GCTGCATGAG CTGCGCGTCC TCACCXSOGCC CACGCTGCGO 900 

GCCCAGCTTG CCCAGOATGG CGTGCAGCTT TGCGCCCTCG ACOACCTGGA CTCCAAGAGG 960 

CCAGGGGAGG AGGTCCCCTG TGAGCCCACT CTGQAACCCT TCCTGGAACC CTCCXTTACTC 1020 

60 TGACCCCCTA CaOACAACCA AGCACTAATC CCCTTAGTAC CAAGAAAGQG GAGCCAGGAT 1080 

TTASTCCTGG CCCAQCCCAG AGCIGGQACC TQGAGCACGA TCTQTTOACT TC CCTOG GTA 1140 

GGACACTGCC ACCTCTGGGC TCAGGTCCTC ATGCCTCCAA ATGGCATCXA GASTTTGAGC 1200 

AGCCTTCTTG GCTGCAGGC31 QGCCTAGCCT GTGGCAGCGG OCTAGGGOOC QCASAQCATT 1260 

TGGTGCCCCT CCATGTTGCA ATGCAAACAC CTTCACCACT GGOGCASTOQ GOAaAOATGG 1320 
65 CTATATTAAT AAARTAA(3GT GTGTCTTTC 



70 

1 11 21 31 41 51 

I I I I I I 

MSRPHMRLW TADDFGYCPR RDEGIVEAFIi AGAVTSVSLL VNGAATESAA ELARHHSIPT 
GI.EAin>SEGR FVGPARRGAS SLLGFBGFFI. GKMGFREAVA AGDVDLPQVR EELiAQIiSCP 
75 RBLLGRAPTH ADGHQHVHVL PGVCQVFAEA LQAYGVRFTR LPLERGVGGC TWLEAPAHAF 
ACAVERDARA AVGPFSHHGL RWTDAFVGLS TCGRHKSAHR VSGALARVLE GTLAaiTI.TA 
BLMAHP6YP.S VPPTGQCGEG PDAFSCSWER UlELRVLTAP TLRAQLAQDG VQLCALDDI.D 
SKRPGBEVPC EPTLEPFLEP SU. 

80 seq ID HO I 1! 
Nucleic Acid 
Coding sequence; 



85 



AGQGGCGCAa GAATTCTGAT GTGAAACTAA CAGTCTGTGA GCCCTGOAAC CTCCGCTCAQ 



AGAAGATOAA GGATATOGAC ATAGGAAAAG ASTATATCAT CCCCAGTCCT GCSGTATAOAA 180

GTGTGAGGGA CAQAACCAGC ACTTCTGGOA CSC3U»|QAGA CG6T6AAGAT TCCAAGTTCA 240 

GaAOAACTCS ACCnrTGOAA TGCCAAGATO CCTTGGAAAC AGCAGCCCGA GC0QAG6GCC 300 

TCTCTCTTGA TGCCTCCATG CATTCTCAGC TCAGAATCCT GGATGAOQAG CATCCCAAGG 360 

GAAAGTACX3V TCATGGCTTG AGTGCTCTGA AGCCCATCXB GACTACTTCC ARAC31CCAGC 420 

ACCCAGTGGA CAATGCTGGG CTTTT T TCCT GTATGACTTT TTCGTGGCTT TCTTCTCTGG 480 

CCCGTGTGGC CCACAAGAAG GGGGAGCTCT CAATGGAAGA CGTGTGGTCT CTGTCCAAGC 540 

AOSAGTCTTC TGACGTGAAC TGCAGAAGAC TAGAGAGACT GTGGCAAGAA GAGCTGAATG 600 

AAGTTQGGCC AOACGCTGCT TCCCTGCX3AA GGGTTGTQTQ GATCTTCTGC OGCACCAGQC 660 

TCATCXrrGTC CATCGTOTGC CTGATGATCA OOCAGCTGGC TG6CTTCA0T GGACCAGCCT 720 

TCATQaTOAA ACACCTCTXG GAGTATACCC AGGC3UU3U3A GTCTAACCTG CAGTACAQCT 780 

TGTTGTTAGT GCTGGCOCTC CTCCTGACGO AAAT08TGCO GTCTTGGTCX3 CTTGCACTGA 840 

CTTGGGCaVTT Ca\ATTACCX3A ACCXIGTGTCC GCTTGCGGQG GGCCATCCTA ACCATGGCAT SOO 

TTAAGAAGAT CCTTAAGTTA AAGAACATTA AAGAGAAATC CCTGGGTQAG CTCATCAACA 960 

TTTGCTCCAA CGATGGGCRG AGARTOTTTG AGGCAGCAGC CGTTGGCAGC CTGCTGGCTG 1020 

GAGGACCCGT TGTTGCCATC TTAGGCATGA TrTATAATGT AATTATTCTG GGACCAACAG 10 BO 

GCTTCCTGGG ATCAGCTGTT TTTATCCTCT TTTACCCAGC AATGATGTTr GCATCAOGGC 1140 

TCACAOCATA TTTCAGGAGA AAATGCKTGG COQCCAOG6A TGAACGTGTC CA GAAGA TSA 1200 

ATGAAQTTCT TACTTAC»TT AAATTTATCA AAATGTATGC CTGGGTCAAA GCATTTTCTC 1260 

AOAGTGTTCA AAAAATCCGC GAGaAGQAaC OTCGOATATT GOAAAAAGCC G60TACTTCC 1320 

A6GGTATCAC TGTGGGTGTG GCTCCCATTG TGGXGGTGAT TCCCAGOGTG GTQACCTTCT 1380 

CTGTTCATAT GACCCTGGGC TTCGATCTGA CAGCftGCACA GGCTTTCACA GTGGTGACAG 1440 

TCTTCAATTC CATGACTTTT GCTTTGAAAG TAACACCGTT TTCAGTAAAG TCCCTCTCAG 1500 

AAGCCTCAGT GGCTGTTGAC AGATTTAAGA GTTTGTTTCT AATGGAAGAG GTTCACATGA 1560 

TAAA6AACAA ACCAGCCAGT CCTCACATCA AGATAGAGAT GAAAAATGCC ACCTTGGCAT 1620 

GGGACTCCTC CCACTCGAGT ATCCAGAACT CGCCCAAGCT GACCCCCAAA ATGAAAAAAG 1680 

ACAAGAGGGC TTCCAGGGGC AAGAAAGAGA AGGTGAGGCA GCTGCAGCGC ACTGAGCATC 1740 

AGGOG6TGCT GQCAQAOCAS AAAGQCCAGC TGCTOdOGA CAaTGAGQAQ CGGOCCAGTC 1800 

OCGAAOAGGA AGAAGGCAAO CACATCCACC TaOGOCACCT OOBCTTACAG AOQACACrac 1860 

ACAGCATOGA TCTGaAGATC CAAaAOGOTA AACTQOrrGG AATCTQOOOC AaTOTaQOAA 1920 

GTGGAAAAAC CTCTCTCATT TCAGCCATTT TAGGCCAGAT OAOGCrTCTA OAGOGCAGCA 1980 

TTGCAATCAG TGGAACCTTC GCTTATGTQG CCCAGC3VGGC CTGGATCCTC AATGCTACTC 2040 

TOAOAOACAA CATCCTGTTT GGGAAGOAAT ATGATGAAGA AAGATACAAC TCTGTGCTGA 2100 

ACAGCTGCTG CCTGAGGCCT GACCTGGCCA TTCTTCCCAG CAQCGACCTG ACGGAGATTG 2160 

QAGAGCGAGG AGCCAACCTG AGCGGTGGGC AGCGCCAGAG GATCAGCCTT GCCCGGGCCI 2220 

TGTATAGTGA CAGGAGCATC TACATCCTGG ACGACCCCCT CAGTGCCTTA GATGCCCATG 2280 

TGG6CAACCA CATCTTCAAI AGTGCTATCC 6GAAACATCT CAAGTCCAA6 ACAOTTCTOT 2340 

TTGTTACCCA CC3U3TTACAO TACCTGGTTQ ACTOTQATGA A6TGATCTTC ATQAAAGAGG 2400 

GCTGTATTAC GSAAAGAGGC ACCCATGASQ AACTQATOAA TTTAAATGST GACTATGCTA 2460 

CCATTTTTAA TAACCTGTTG CTGGGAGAGA CACOGCCAGT TGAGATCAAT TCAAAAAAGQ 2520 

AAACCAQTGG TTCACAGAAG AAGTCACAAG ACAAGGGTCC TAAAACAGGA TCAQTAAAGA 2580 

AGGAAAAAGC AGTAAAGCCA GAGGAAGGGC AGCTTQTGCA GCTGGAAflAG AAAGGGCAGG 2640 

GTTCAGTGCC CTGGTCAGTA TATGGTGTCT ACATCCAGGC TGCTGGGGGC CCCTTGGC3VT 2700 

TCCTGGTTAT TATGGCCCTT TTCATGCTGA ATGTAGGCAG CACCGCCTTC AGCACXTTOGT 2760 

GGTTGAGTTA CIGGATCAAG CAAGGAAGCG GGAACACCAC TGTOACTaGA GGGAACQAGA 2820 

CCTCGGTGAG TGAOMSCATG AAGGACAATC CTCATATGCA GTACTATGCC AGCATCTACG 2880 

CCCTCTCCM' GGCAGTCATG CTGATCCTGA AAGCCATTC6 AGGAGTTGTC TTrGTCAAGG 2940 

GCAOGCTGCG AGCTTCCTCX: OGGCTGCATG AOGAGCTTTT CCGAAGGATC CTTCGAAGCC 3000 

CTATGAAGTT TTTTGACaCG ACKCCCACAG GGAGGATTCT CaACAGGTrT TCCAAAGACA 3060 

TGGATGAAGT TGAOGTGCGG CTGCCGTTCC AGGCCX3AGAT QTTCATCC3VG AACGTTATCC 3120 

TGGTGTTCTT CTGTGTGGGA ATGATCGCAO GAGTCTTCCC GTGGTTCCTT GTGGCAQTGG 3180 

GGCCCCTTGT CATCCTCTTT TCAGTCCTGC ACATTGTCTC CAGGGTCCTG ATTCGGGAGC 3240 

TGAAGCGTCT GQACAATATC ACGCAGTCAC CTTTCCTCTC CCACATCACG TCCAGCATAC 3300 

AGGGCCTTGC CACCATCCAC GCCTACAATA AAGGGCAGGA GTTTCIGCAC AGATACCAGG 3360 

AOCTGCTGGA TGACAACCAA GCTCCTTTTT TTTTGTTTAC GTGTGCGATG CGGTGGCTGG 3420 

CrOTGOSGCT GGACCTCATC AGCATOSCCC TCATCACCAC CACGGGGCTQ ATQATCQTTC 3480 

TTATGCACGG GCAGATTCCC CCAGCCTATG CGGGTCTCGC CATCTCTTAT GCTGTCCAGT 3540 

TAAOGGGGCT GTTCCAGTTT ACGGTCAGAC IGGCATCTGA GACAGAAGCT CGATTCAOCT 3600 

CQGTGGAGAG GATCAATCAC TACATTAAGA CTCTGTCCTT GGAAGCAOCT GCCAGAATTA 3660 

AOAACAAGGC TCCCTCCCCT GACIGGCCCC AGGAGGGAGA G6TQACCTTT 6AGAACX3CAG 3720 

AGATGAGGTA C06AGAAAAC CTCCCTCnXJ TCCTAAAGAA AGTATCCTTC ACQATCAAAC* 3780 

CTAAAGAGA& GATTCGCATT GTGGGGOGGA GAGGATCaVGa GAAGTCCTCG CTGGGGRiaa 3840 

<XCTCTTCXX3 TCTGCrTOGAG TTATCTGGAG GCTGCATCAA GATTGATGGA GTGAGAATCA .3900 

GTGATATTGO CCTTGCOGAC CTCOGAAGCA AACTCICTAT CATTCXTrCRA GAGCOGGTGC 3960 

TGTTCAGTGG CACTGTCAGA TCAAATTTGG ACCOCTTCAA CCAGTACACT GAAGACCAGA 4020 

TTTGGGATQC CCTGaAGAOG ACACACATGA AAGAAT8TAT TGCTCAGCTA CCTCTGAAAC 4080 

TTGAATCTGA AGTGATGGAG AATGGGGATA ACTTCTCRaT GGGGGAAGGG C3VGCTCTTQT 4140 

GCATAGCTAG AGCCCTGCTC OGCCACIGTA AGATTCTGAT TTTAGATGAA GCCACAGCTG 4200 

OCATGGACAC AQASACA6AC TTATTGATTC AAGAGACCAT CGGAGAAGCA TTTGC3U3ACT 4260 

GTACCATGCT OAOCATTGCX: CATCCXCTQC ACACGGTTCr AGGCTCOSAT AGQATTATGG 4320 

TOCTGOCCCA GG8ACAG6TG aTaGAOTTTQ ACAOCCCATC GGTCCTTCTO TCCAAOQACA 4380 

GTTCCOGATT CTATGCCATO TTJGCrUtJTO CAGAGAACAA GGTCJaCTGTC AAGGGCTGAC 4440 

TCCTCCCTGT TGACGAAGTC TCTTTTCTTT AGAGCATTGC CATTCCCTGC CTGGGGCGGG 4500 

CCCCTCATOG CGTCCTCCTA CCGAAACCTT GCCTTTCTOG ATTTTATCTT TCGCACAGCaV 4560 

GTTCOGGATT GGCTTGTGTG TTTCACTTTT AGGGAGAGTC ATATTTTGAT TATTGTATTT 4620 

ATTCCATATT CATCTAAACA AAATTTAGTT TTTGTTCTTA ATTGCACTCT AAAAGGTTCA 4680 

GGGAACCGTT ATTATAATTG TATCAOAGGC CTATAATGAA GCTTTATACO TGTA GCTATA 4740 

TCTATATATA ATTCTGTACA TAGCCTATAT TTACAGTGAA AATOTAAOCT GTTTAT yri'A 4800 

TATTAAAATA AGCACCGTGC TAATAACAGT GCATATTCCT TTCTATCATT TTTGIACAGT 4860 

TTGCTGTACT AGAGATCTGG TTTTQCTArP AGACTStASa AAGABTAGCA TTTCATTCTT 4920 

CTCTAGCTGG TGGTTTCACG aTGCCAGGTT TTCTGGGTGT CCAAAGGAAQ AOGTGTGGCA 4980 

ATAGTGGGCC CTCCGACAGC CCCCTCTGCC GCCTCCCXac AGCOSCTCCA GGGGTGGCTG 5040 

GAGAOGGGTG GGCGGCTGGA GACCATGCAG AGGGCCGTGA GTTCTCAGGG CTCCTGCCTT 5100

CTGTCCTGGT GTCACTTACT GTTTCTGTCA GGAOAQCAGC GGGGCGAAGC CCAGGCCCCT 5160 

TTTCACTCCC TCCATCAAGA ATGGGGATCA CAGAGACATT CCTCCGAGCC GGGGAGTTTC 5220 

TTTCCTQCCT TCTTCTTTTT GCTGTTGITT CTAAACAAGA ATCAOTCTAT CCACAQAGAG 5280 

TCCCACTGOC TCAGGTTCCT ATGGCTGGCC ACXGCACAGA OCTCTCC3«5C TCCAAGACCT 5340

GTTGGTTCCA AGCCCTC3GAG CCRACTGCTG CTTTrTGAGG TGGCACTTTT TCftTTTOCCT 5400

ATTCCCACAC CTCCACAGTT CAGTGGCAGG GCrOkCGATT TCGTGGGTCT GTTTTCCTTT 5460 

CTCACCX3CAG TCGTCGCACR GTCTCTCTCT CTCTCTCCCC TCRAAOTCTG CAACTTTAAG 5520

CAGCTCTTGC TAATCACFTGT CTCACACK5Q CGTAGAAGTT TTTBTACTGT AAAGAIS^CCT 5580

ACCTCAGGTT QCTOOTTGCT GTGTGGTTTG GTQTQTTCCC GCAAACCCCC TTTGTGCTGT 5640

GGGaCTGGTA GCTCAGGTGG GCGTG6TCAC TGCTGTCATC AQTTOAATGG TCAGCSTTQC 5700

ATGTCGTGAC CAACTAOACA TrCTGTOGCC TTAGCATGTT TGCTGAACAC CTTGTGGAAQ 5760

CAAAAATCIG AAAATGTQAA TAAAATTATT TTOaATTTTQ TAAAAAAAAA AAAAAAAAAA 5820



1 11 21 31 41 51 

ilKDIDIGKEY IIPSPGYSSV RBRTSTSGTH RDRBDSKPRR TRPLEOSMU. ETAASAEGLS 
l,nASMIISQI.R IIJJEEHPKGK YHHai.SAMCP IRTTSKHQHP VDHAGLPSCM TFSWLSSLAR 
VAHKKGELSM EDVWSLSKHE SSDVNCRRLB RLUQBEUIBV GSDAASLBaV VWIFCH.TBI.I 
MIVCLMITQ LAGPSGPAFM VKHIiI.BYTQA TBSNLQYSUt. LVUSIALTEI VHSWSXALTW 
AUnfRTGVRI, RGAILTMAFK KILKLKNIKB KSLGBLINIC 8NDG(»MPEA AAVOSLMGG 300 

PWAIliGMIY 1IVHU3PTGP LGSAVFILFY PAMMFASRI.T AYPRRKCWAA TDBRVQKMNE 360 

VLTYIKFIKM YAWVKAPSQS VQKIRBBERR ILEKAGYFQG ITVGVAPIW VIASWTPSV 420 

HMTIiOFDIiTA AQAFTWTVF NSMTFAI.KVT PFSVKSI.SEA SVAV PRPK SL PLMEEVHMIK 480 

UKPASPHIKI EMKNATIAWD SSHSSIQNSP KLTPKMKKDK RASRGKKEKV HQWJRTEHQA 540

VIAEQRGHU. LDSDERPSPB EEEGKHIHLG HLRLQRTLHS IDLEIQEGKI. VGICGSVGSQ 600 

KTSLISAIIiG QMTLLEGSIA ISGTFAYVAO QAWILNATLR DNILFGKEYP EEHYNS;^ 660 

CCLRPDLAIL PSSDI.TEIQE RQAMLSGGQR QRISLARALY SDRSIYIMD PLSM^AOTG 720 

KHIFNSAIRK HLKSKTVMV THQLQYLVDC DBVIPMKEGC rTBHGIHEEI. MnJTODYATI 780 

ENNIiLLGETP PVEINSKKET SOSQKKSQDK GPKTGSVKKB KAVKPBBGQIi VQLBEKGQGa 840

VPWSVTfGVYl QAAGGPLAFI. VIMALEMLNV GSTAFSTWWI. SYWIKQGSOI TTVTRGNETS 900 

VSDSMKDKPH MOYYASIYAI. SMAVMLILKA IRGWFVKGT LRASSRIHDE LFRRILRSPM 960 

KFP0TTPTGR lUIRFSKOHD EVDVRLPPQA EMPIQHVILV PPCVGMIAGV PPWFLVAVGP 1020 

LVILFSVLHI VSRVLIRELK RI.DNITQSPF LSHITSSIQG LATIHAYNKG QEFLHRYQEL 1080 

U3DNQAPFFI. FTCAMRWLAV RLDLISIALI TTTGLMIVLM HGQIPPAYAG lAISYAVQLT 1140 

GLFQFTVRIA SBTEARFTSV ERIHHYIKTI. SI.EAPARIKN KAPSPDWPOB GEVTPEHAHJ 1200 

RYRENLPbVL KKVSPTIXPK BKIGIVGRTQ SOKSSUCaiAl, FRtVELSGGC IiaDGVRISD 1260 

IGLADLRSKL SIIPQEPVLF SQTVRSNtDP FHQYTEDQIW DALERTHMKB dAQLPLKUE 1320 

SEVMENGDNF SVGERQLra AKAL1«HCKI I.IIJ3BATAAM DTBTDI.I.IQB TIHEAPADCT 1380 
MLTXAHRLKT VUSSDRIKWI. AQGQWBPDT PSVLLSHDS8 RFYAMFAAAB HKVAVKG 



Seq ID NO: 195 DNA sequence
Nucleic Acid Accession #: NM_006470
Coding sequence: 228.. 1922 



1 11 21 31 41 51 

GCTGTCCTCA GCCTGAGTAC TCTAGCTGCC TTGTCGCCAT CGCATCTGGC TGCCATCCAG 60 
CGCCAGCACA CAGTAATGAG TGGCCGAGCT TCCTCTGGGA GG GAGGAA AC AGfTTAAAATC 120 
TTGCAGCAGC TGCAATCATC TAGGCGTGGT TCTCTTGTCT GACTTGGGCT GCACA taTC C 180 
TGGGCCAAGG GACAGAAGAA AGACAGCCTA GGAGCAGAGC CTCCCAOATG GCTGAGTTeO 240 
ATCTAATGGC TCCAGGGCCA CTGCCCAGGG CCACTGCTCA GOXiaaGCC CCTCTCACCC 300 
CAGACTCTGG GTCACCCAQC CCAGATTCXG GGTCAGCCAG CCCAGTGQAA GAAGAGGAOG 360 
TGGGCTCCTC GGAGAAGCTT GGCAGGQAGA CGGAGGAACA GGACAGOSAC TCTGCAGAGC 420 
AGGGGGATCC TaCreaTGAG GGGAAAGAGG TCCTGTGTGA CTTCTGCCTT GATGACACCA 480 
(SmGABIGAA GQCAOTQAAG TCCTGTCTAA CCTGCATGGT GAATTACIGT GAAGAGCACT 540 
TGCAQCCGCA TCAGGTCAAC ATCAAACTGC AAAGCCACCT GCTGA CCGAG CCAOTGAAGG 600 
ACCACAACTG GCGATACTGC CCTGCCCACC ACAGCCCACT GTCTGCTTTC TGCTGCCCTO 660 
ATCAGCAGTG CATCTGCCAG GACT6TTGCC AGGAGCACAQ TQGCCACACC AT^CTCCC 720 
TGGATGCAGC COGCAGQGAC AAGGAGGCTa AACTCCASTG CACCCBGTTA OA CTKMAG C 780 
GGAAACTCAA GTTCAATGAA AATGCCATCT CCAGGCTCCA GGCTAACCAA AAGICTOTTC 840 
TQGTQTCGGT GTCAGAGGTC AAAGCGGXGG CTGAAATGCA GTTTGGGGAA CTCCTTGCTG 900 
CTGTGAGGAA QGCCCAGGCC AATGTGATGC TCTTCTTAGA GGAGAAGGAG CAAGCIGCGC 960 
TGAGCCAGGC CAACGQTATC AAGGCCCACC TGGAGTACAG GAGTGCCGAG ATGGAGAAGA 1020 
GCAAGCAGGA GCTGGAGAGG ATGGCGGCCA TCAGCRACAC TCTCCAGTrC TTGGAGGAGT 1080 
ACTGCAAQTT TAAGAACACT GAAGACATCA CCTTCCCTAG TQTTTACGTA GGGCTGAAGG 1140 
ATAAACTCTC GGGCATCOGC AAAGTTATCA CGGAATCCAC TGTACACTTA ATCCAGTTGC 1200 
TGGAGAACTA TARGAAAAAG CTCCAQOWST TTTCCAAGGA AGAGGAQTAT GACATCAGAA 1260 
CTCAAGTGTC TGCOGTTGTT CAGCGCAAAT ATTGQACTrC CAAACCTGAG CCCAGCAMA 1320 
GGCSAACAGTT CCTCCAATAT GOGTATGACA TCAOQTTTGA CCOGGACACA OCACACAAGT 1380 
ATCXCCGGCt GCAGGAGGAG A1WX3GCAAGG TCACCAACAC CAOGCCCTGG GAGCMCCCT 1440 
AGCCGGACCT CCCCAGCAGG TTCCTGCACT GGCGGCAGOT QCTGTCCCAG CAGAGTCTGT 1500 
ACCTGCACAG GTACTATTTT GAQGTGGAGA TCTTCGGGGC AGGCACCTAT GTTGGCCTGA 1560 
CCTGCAAAGG CATCGACCGG AAAGQGGAGG AGCGCAACAG TTGCATTTCC GGAAACAACT 1620 
TCTCCTGGAG CCTCCAATGG AACGGGAAGG AGTTCACGGC CTGGTACAGT GACATGGAGA 1680 
CCCCACTCAA AGCTGGCCCT TTCCGGAGGC TCGGGGTCTA TATCGACTTC C^GAGGGA 1740 
TCCTTTCCTT CTATGGCQTA aAGTAIGATA CCATGACTCT GGTTCACAAG TTTGCCTGCA 1800 
AATTTTCAGA ACCAGICTAT GCTGCCTTCT GGCTTTOCAA GAAGQAAAAC GCCATCOGGA 1860 
TTGTAGATCT GGGAGAQGAA CCOGftGAAGC CAGCACOGTC CTTGQGGQTQ ACTGCTCCCT 1920 
AGACTCCAGG AGCCATATCC CAQACCTTTG OCAQCXACAa TQRTGGGATT TGCATTTTAQ 1980 
GGTGATTTGT GGGCAGAAAT AACTGCTGAT GGTAGCTGGC TTTTGAAATC CT^GGGGTC 2040 

TCTGAATGAA AACATTCTCC AGCTGCTCTC TTTTGCTCCA TATGGTGCTG TTCTCTATGT 2100 

GTTTGCAGTA ATTCTTTTTT TTTTTTTTGA GACGGAGTCT OGCACTGTTG CCCAGGCrGG 2160 

AGAGCAGTGG CGCGATCTTG GCTCACTGCA AGCTCOGCCT CCCGAGTTCA AGCAATTCTC 2220 

CTGCCTCAGC CTCCCGAGTA GCTGGGATTA CAGGTGCCTG CCACCACAOC CBGCraMGT 2280 

1TTGTATTTT TAOTAQAflAT QQQQTTTCAC CATGTTGGCC AGGCAGATCI CAAACTCCTG 2340
ACCTCGTGAT GCACCCACCT CGQCCTCCCA ABQTGCTQGG ATTACATCCXJ TOAaCCMrrQ 2400
CGCCCTGCCr GTTTGTAGrA ATTTTTAGGC ACCAAATCIC CCTCATCTTC TAOTGCCATT 2460
CTCCrCTCTG TTCAGOTAAA TCTCRCACTG TGCCCA6AA7 GGATGACCAG GAACCTTAAA 2520
GAGTGGCTGA AAAGATtGCA GA6TTATCAT AATAAATTGC TAACTTGOGT 



Seq ID NO: 196 Protein sequence:
Protein Accession #: NP_006461

1 11 21 31 41 51 

I I I I I I 

rXAELDLMAPQ PLFRATAQPP APLSPDSGSP SPDSGSASPV EEEDVQSSEK LGKBIEEQDS 60 

DSAEQGDPAG EGKEVIiCDPC LDDTRRVKAV KSCLTCMVMY CEEHLQPHQV NIKLQSBUiT 120 

EPVKDHNWBY CPAHHSPLSA PCCPDQQCIC QDCCQEHSC» TIVSLDAAHR DKEAELQCTQ 180 

UJIjKRKLKLH ENAISRI.QAN QKSVLVSVSE VKAVAEMQFG ELIAAVRKAQ ANVMLFIjEEK 240 

BQAALSQANG IKAHLEYRSA BMBKSKQEIiB RMAAISNTVQ FIiBBYCKFRII TEOITFPSVY 300 

VGLKDKLSGI RKWITESTVH I1IQU.ENYKK KIiQEFSKEEB VDIRTQVSAV VQHKYWTSKP 360 

BPSTREQPLQ YAYDITFDPD TARKYLRLQE ESRICVTOTTP NEHPYPOIiPS RFIiHWRQVIiS 420 

QQSI>yi£RYy FEVGIPOAGT YVGIiTCKGID RRGEERNSCI 5GNNP6WSLQ HNGKEFTAHY 480 

SDMETPI.KRG PFRSLQVYID FPGGIIiSFyO VBYDTMTLVH KFACKPSEPV YAAFWLSKKB 540 
HAIBIVDL6E BPEKPAPSLG VTAP 

Seq ID NO: 197 DNA sequence
Nucleic Acid Accession #: NM_004316 
Coding sequence: 433-1149 

1 11 21 31 41 51

I I I I I I

CCGQAGACCC GOCGCAAQAG AGGGCAGOCT TAGTAGQM3A GQhACGOGAG AOGCGGCAGA 60 

GGGCGTTCAa CACIQACTTT TOCTQCTOCT TCTGCTTTTT TTTTTCTTAG AAACAAGAAG 120

GCXJCCAGCGG CAGCCTCACA OGCQAOCiGCC ACGCGAGGCT CCCOAAGCCA ACOOGCGAAG 180 

GQAGQAGGGG AGGGAGGAGG AGGCGGCGTG CAGGGAGGAG ARAAAGCATT TTCACCTTTT 240 

TTGCrcCCAC TCTAAGAAGT OTCCCGGGGA TTTTGTATAT ATTTTTTAAC TTCCGTCAGG 300 

GCTCCCGCTT CRTATTTCCT TTTCTTTCCC TCTCTGTTCC TGCRCCC3U«3 TTCTCTCTQT 360 

QTCCCCCTCG CGGGCCCCGC ACCTCGCGTC CCGGATCGCT CIGRTTCOSC GACTCCTTGG 420 

CCGCCGCTGC GCATGGAAAG CTCTGCCAAG ATGOAGAGOG GCGGCGCCGG CCAGCAGCCC 480

CAGCCGCAGC CCCAGCAGCC CTTCCTGCCG CCCGCAGCCT GTTTCTTTGC CACX3QCCX3CA 540

OCGGCGGCQQ COGCaUSCCGC 06CM»»QCA GCaCAGAGGa CGCAGCAOCA GCAQCASCAG 600 

CAGCAGCAGC AGCAGCAGCA 6CA6GCGOC6 CStGCTQMSAC OQGCQGCXXW 0Q6CCAGCCC 660 

TCAGGGGGCG GTCRCRAGTC AGOGCCCAAG CAAGTCAAGC GACAGCQCTC GTCTTCGCCC 720 

GAACTGATGC GCTGCaAACG CCGGCTCAAC TTCAGCGGCT TTGGCTACAG CCTGCCGCAG 780 

CAGCAGCCGG CCGCCGTGGC GCGCCGCAAC GAGCGC3GAGC GCAACCGCGT CAAGTTQGTC 840 

AACCTGGGCT TTGCCACCCT TCX3GGAGCAC GTCCCCRACG GCGCGGCCAA CAAGAAXSATG 900 

AGTAAGGTGG AGACACTGCG CTCGGOQGfrC GAOTACATCC GCGCGCTGCA GCAGCTGCTG 960 

GACGJU3CATG ACQCGGTGAG CGCCGCCTTC C3U3GCRGGa3 TCCTGTCX3CC CACCATCTCC 1020 

CCCAACTACT CCRACGACTT GAACICCAIQ GCCGGCTCGC CGGTCTCATC OTACTCGTCG 1080 

GAGGAGG6CT CTTAGBACCC GCTCRSCXXX: QRGGAGCAGG AGCTTCTOGA CTTCACCAAC 1140 

TGGTTCTQAG GGGCTOGGCC TGOTCAGGCX: CTGGTGCGAA TGGACTTTCG AAGCAGGGTG 1200 

ATCGCACAAC CTGCATCTTT AGTOCrrTCT TGTCAGTGGC GTTGGGAGGG GGAGRAAAGG 1260 

AAAAGAAAAA AAAASAAGAA GAAGAAGAAA AQAGAAGAAG AAAAAAAOGA AAACAGTCAA 1320 

CCAACKCCAT CGCCAACTAA GOGAGGCATG CCTGAGA6AC ATGGCTTTCA GAAAACGaQA 1380 

AGCGCTCAGA ACAGTATCTT TGCACTCCRA TCRTTCACGG AGATATOABG AGCAACIGOa 1440 

ACCTGAGTCA ATGCGCAAAA TGCAGCTTGT 6TGCAAAAGC AGTGGGCTCC TGQCRGflAOq 1500 

GAGCAGCACA CX3CGTTATAG TAACTCCCAT CACCTCTAAC ACGOkCAQCT OAAASTTCTT 1560 

GCTCGGGTCC CTTCACCTCC CCGCCCTTTC TTASAGTGCA GTTCTTAGCC CTCTAGAAAC 1620 
GAGTTGGTGT CTTTC 



Seq ID NO: 198 Protein sequence:
Protein Accession #: NP_004307

1 11 21 31 41 51 

I I I I I I

MBSSAXMESG GAGQQPQPQP QQPFI.PPAAC FFATAAAAAA AAAAAAAQSA QQQQgQQQQQ 60 
QQQQAPQUIP AADGQPSGGQ KKSAPRQVKR QRSSSPBUOl CKRRLNFSQF GYSLPgQQPA 120 
AVARRNERER HRVXLVHIiGF ATIiRBHVPNO AANKXHSKVB TLRSAVEYIR AIQQUUDEHD 180 
AVSAAPQAGV LSFTISPMYS NOLHSMAOSP VSSYSSDEGS YDPLSPEEQE LLDFTNHF 

Seq ID NO: 199 DNA sequence
Nucleic Acid Accession #: NM_007015
Coding sequence: 1-1005

1 11 21 31 41 51 

I I I I I I 

ATGACAGAGA ACTCCGACAA AGTTCCCATT GCCCTGGTGG GACCTGATGA CGTGGAATTC 60 

TGCAGCCCCC COGCGTACGC TAOGCTGACG GTGAAGCCCT CCAGCCCCGC GCGGCTGCTC 120 

AAGGTGGGAG COGTOGTCCT CATTTCGGGA GCTGTGCTGC TGCTCTTTGG GGCCATOQGG 180 

GCCTTCTACT TCTGOAAGGG OBGCaAaVGT CACATTTACR ATOTCCATTA CACGA TGAOT 240 

ATCAATGSGA AACCACAAiSA TSGSTCAATG aAAATAQAOS CIGOOAACAA CTTGOAOACC 300 

TTTAAAATGG GAAGtGGAGC TGAAGAASCA ATTGeAOTTA ATOATTTCCA QAATGGCAXC 360 

ACAGGAATTC GTTTTGCTG6 AGGAGAGAAS TGCTACATTA AAGCGCAAGT GAAGGCTCGT 420 

ATTCCTGAGG TGGGCQCCGT GACCAAACAG AGCATCTCCT OCAAACTGGA AGGCAAGATC 480 

ATGCCAGTCA AATATGAAGA AAATTCTCTT ATCTGGGTGG CTGTAGATCA GCCTGTGftAG 540 

GACAACAGCT TCTTGAGTTC TAAGGTGTTA GAACTCTGCG GTGACCTTCC TATTTTCrGG 600 

CTTAAACCAA CCTATCCAAA AGAAATCCRG AGGGAAAGAA QAGAAGTGGT AAGAAAAATT 660 

GTTCCAACTA CCRCAAAAAG ACXaWSVCAGT QGACCACGGA GCA ACCC AGG CGCTGGAAGA 720 

CIGAATAATG AAAOCAGACC CAGIGTTCAA GAGGACTCAC AAGCCTTCAA TCCTGATAAT 780 
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0 
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CCTTMCATC AGCAGGAAGG GGAAAGCATG ACRTTCGACC CTftGACTGOA TCROQARGQA 840 

ATCTGTTGTA TAGAATGTAG GOGGAGCTAC ACCCACTGCC AGAAGATCT6 TGAACCCCTG 900 

GGQGGCTATT ACCCATGGCC TTATAATTAT CAAGGCTOCC GTTOCSGCCTG CftOftG TCATC 960 

ATGCXXTGTA GCTGGTGGQT GGCCCGTATC TTGGGCATGG TGTGAAATCA CTTOVTATAT 1020 

CACGTGCTCT AAAATAAGAA CTAGCTQAAG AGACAACCAA AGAAGCATTA AGGC3U3GTTG 1080 

ATGCTOATGO OACCRTAAAA TATTTTTACA CX3CAGCCTGA GCGOTTATTC TTGAC3VCTCT 1140 
TAAC3USAATT TTTTTAATOS TTTTCC3VGAA CTTTAGTATA TGCAAATGCA CTGAAAGGGT 1200
AGTTCAAGTC TAAAATGCCA TAACCCCGTT ATrTGTTATT TTTTATTTGC ATT6ATTTGC 1260
CATAAGTCTT CCCTTGCTTG CATCTTCCAA AGCTATTTCG AAATAAACAC GAAAATTTAC 1320
AGTTTGCC



MTENSDKVPI ALVQPDDVEF CSPPAYATLT VKPSSPARLL KVGAWLISG AVU.LFGAIG 
APYPWKG9DS HIVNVHYTMS INGKLQDGSM EIDAGMIJIOT FKMOSGREEA lAVNDFQNGI 
TCIRFAGGEK CYIKAQVKAR IPEVGAVTKQ SISSKLBQKI MPVWtB EWSI. IWVAVDQPVK 
DNSFLSSKVL ELCGDLPIFW LiCPTYPKEIQ REHHBWRKI VPTTTKRPHS SPRSNPGABR 
UMETRPSVQ EDSQAFNPDN PYKQQEGESM TFDPRIiDHBS ICCIECRRSY THOQKICBPL 
GGXyPWPWY QGCRSACRVI MPCSWWVARI U3MV 

Seq ID NO: 201 DNA sequence

Nucleic Acid Accession #: NM_000728.2

Coding sequence: 112..495



OTAATAAGAG 
OTCGA COGQC 
CGOAAGTTCT 



GAGCTGAAGC 
AACACTGCCA 
GTGAAGAGCA 
GACCTTCAAG 
CATATCCITA 
AAGGAGQCAC 
TGGAAGAAGA 
GAGAATARTT 
GGAAACTAAT 
GGTTATTTGG 



TGGATGCAAG 
TTGCTTTTTC 
TCX»ATTCAT 
TTCC CTAACT 
TATAGTTTTA 
TGAGAGSTGT 
TTTGTTAAAA 



ACCATTCTTT 
ACCaVQATAAT 
CTAAAATATT 
TTCTGATGAG 
CATATTAATA 
GTTTTCTCTG 
ATATCTTSTT 



CGQGQTCTCC 
CQCTCGCGCT 
CCCCCTTCCT 
CATTCAGGTC 
GCCTCCTGCT 
AGGAGCAGGA 
CCTGTGTGAC 
ACTTCGTGGC 
CCTQAGCAQA 
TAAGAGATTC 
AAGCCAAGGA 
GCAGCCCTGC 
TCTGTTGTTT 
ACAATACATT 
AAAQTOTGTA 

QTATCTCATT 
TTACCATATS 
TGGCTTGCTT 
ATTGTTTTCA 
ATTTTCTTAG 
CTTTTTTTTT 
AAGGTCXXAA 
TATTTTATAT 
AGQITGAAAT 
AQACTGTTAT 
TIQTGTQGGT 



GCCCTGAAAC 
GGCTCTCAGT 
TGCCCTGGAG 
GGCTGCACTC 



31 

1 

CGCXXaCAGC 
TCTAOTCQCC 
ATCTTGGTCC 
AGCAQCCCAG 
6TGCAGGACT 
TCCAGCTCCG 



TGTACCAGGC 
ACCCGGCCAC 
ATGTQC3VGAT 



51 
I 

TTCATCCCGG 
CATGGGTTTC 
GGGCAGCCTC 
ACTCAGTAAA 
GAAGGCCA6T 

GAGAGcxnrac 



TCATOOGCTO GCAOGCITGC 1 



TOAATQACTC 
ACICAQAAGA 
AGTCTGTGTC 
TQACACCTAG 
TAAGCCACRA 
TTCATTTATT 
TTTAACTCTG 
TGCCAGCCAC 



T ACCAG AAGC 
AGTTTGGACT 
AGTTTGTGGT 



AATTTGTTAT 
GCCTTGGAGT 
CCAAACTATT 
TCTAGTATTT 



TGTTGAGTTT 
TTGGAAACTT 
CACAG AGAAA 
TTGTGCTTTT 
ACCTT ATTCT 
QATCTATTTT 
QAATATAQAT 



TTASAGCTCT 



TGCCAATQTC 
QTGGGTCTAC 
TTCTACATCT 
ATTTTTAATG 
ATATTAAGTC 



GCTGTTTTAA TTTOTATTTC CCCA ATOA CT 
TTTATCACCT 
GCTTTATTAG 
GATATATAGT 
CAGTGTCTCT 
CTTTTATGTA 
GGTCACAATA 
GTAGATTAGT 
TCATACCTGT 
TTCACCATTT 
ATATTTCIGO 



ATACTGCxrrr 

CAACATTQTT 
TTTATACATT 
GGATTGTGTT 
GTTCAATTCA 
TTTAACA6TG 
CTATTTTATT 
ATTGCTAfiTA 



GTTCTCARTT 
GATTAGTSTA 
CATTCTTGTT 
TTAGRATCAG 
AAATCAGTGG 
TGAAOVCAAT 
TTCTCRGTTT 



CCTTL 

AAGTTOTAAT 
AGTTC3VTOXC 
ATACTTTCTT 
OAGTTAATTr 
ACCCAATTGT 
GCACCTTTGT 
CTGTCtCATT 
6TOTTAAAGT 
CAAAAAOATT 
TGTGTTACTA 
GTTAATTTTQ 
ACATGTTTTC 



AATCXaATQA 
TGACAGAG6C 
GAACAGTCTC 
A6AACTGTGA 
GACAGCCCTA 
GGGATTGCTG 
TTCTGAAGTG 
CCACaVAATAG 
AATGACGTTG 
ATCTTCTGCT 
TTATATQTTG 
AATCTGOGGA 
TTGAATAAQA 
TAAGAACTCT 
GTAAAA6TTT 
TTGTATAABG 
TTCAGTaCX» 
CAAAAAGCAA 
GATTGATTTG 
GAATCTCAAA 
TTAGCTACAT 
TCTACAAAAT 



ACTTATTTAQ 
TATTCXACAC 
GTACTTAAAC 



1020 
1080 
1140 
1200 
1260 
1330 
1380 
1440 
1500 

1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 






I I I I I I

MGFHKPSPFL ALSILVLYQA GSLQAAPFRS ALESSPDPAT LSKEDARLLI. AALVQDYVQM
KASEUOQEQB T0S8SSAAQK RACNTATCVT URLAGLLSRS



Seq ID NO: 203 DNA sequence
Nucleic Acid Accession #: NM_001741
Coding sequence: 71..496



1 11 21 31 41 51

CTCXGGCTOG ACGCCGCCGC CGCCGCTGCC ACCGCCTCTG KTCOMXCf. (XTXCCOCCt. 60
OASAGGTerC ATGGGCTTCC AAAAGTTCTC CKCCTTCCTG GCTCTC3VGCA TCTTGGTCCT 120
QTtaCftGOCA GGCAGOCTCC ATGCAGCACK ATTCAGGTCT GCCCTGGAGA GCAGCXCA6C 180
AGAOCOOaCC ACQCTCAQTQ AGGACQAAGC GCGCCTCCIQ CrOOCTGCAC TGGTGCAGGA 240
CTAXCTGCAQ ATGAAGGCCA GTGKGCtOGA GCAGGAGCAA GAGAOftflAGQ GCTCCAGCCT 300
GGACAGCCCC AGATCTAAGC GGTGOGGTAA TCTGAGTACT TGCATGCTGG GCACATACAC 360
GCAGGACrrC AACAACTTTC ACACXSTTCXX: CCAAACTGCA ATTGCXSGTTG GAGCACCTGG 420
AAAGAAAAGG GATATGTCCA GCXaCTTGQR GASftfiACCaT CQCCCTCATG TTAGCATGCC 480
CCAGAATGCC AACTAAACTC CTCCCTTTCC TTCCTAATTT CCCTTCTTGC ATCCTTCCTA 540
TAACTTGATG CATGTGGTTT GGTTCCTCTC TGGTGGCTCT TIGGGCTGGT ATTCGTGGCT 600
TTCCTTGTGG CAGAGGATQT CTCAAACTTC AOATGGGAGQ AAAGAGAGCA G6ACTCACAG 660
BTTGGAAGAQ AATCACCTG6 GAAAATACCA GAAAATGAGO GOCGCTTTGA GTCCCCCAGA 720
GATGTCATCA GAQCTCCTCT GTOCTGCTTC TOAATBTGCT GATCATTTGA GOAATAAAAT 780
TATTTTTCCC C

I I I I I I

MGFQKFSPFIj AliSIIjVIiLQA GSLBAAPFRS AI.ESSFADFA TIiSEDBARIiL IJUU.VQDYVQ
MKASEI.EQEQ ERBGSSIiDSP RSKRCXaiLST CMLGTYTQDF NKFHTFPQTA IGVGAPGKKR

WO 02/086443 PCT/US02/12476

Seq ID NO: 205 DNA sequence
Nucleic Acid Accession #: NM_005361
Coding sequence: 1-945

1 11 21 31 41 51

I I I I I I

ATGCCTCTTG AGCAGAGGAG TCAGCACTGC AAGCCTGAAQ AAGGCCTTC5A GGCCCGAGGA 60

GAGGCCCTGG GCCTGGTGGG TGOGCAGGCT CCTGCTACTG AGGAGCAGCA GACCGCTTCT 120

TCCTCTTCTA CTCTAGTGGA AGTTACCCTG GGGGAGGTGC CTGCTGCCGA CTCACCGAGT 180

CCTCCCCACA GTCCTCAGGG AGCXTTCCAGC TTCTOGACTA CCATCAACTA CACTCTTTGG 240

AGACAATCCG ATSAaOGCTC CAGCAACCAA GAAGAGGAGG GGCCAAGAAT GTTTCCCGAC 300

CTGGAQTCCG AGTrCCAAQC AGC3MVTCAGT AGOAAGATGG TTOAOTTOGT TCATTTTCTG 360

CTCCTCAAGT ATCGAGCCAQ OGAGCCQOTC ACAAAGGCAG AAXTQCTGOA QAOTOTOCTC 420

AGAAATTGCC AGGACTTCTT TCCCGTGATC TTCAOCAAAG CCTCCOAOTA CTTQCAaCTG 480

GTCTTTGGCA TCGAGGTGGT GGAAGTGGTC CCCATCApCC ACTTGTACAT CCITOTCaCC 540

TGCCTGGGCC TCTCCTACGA TGGCCTGCTG GQCQACAATC AOGTCATQCC CAAGACAGQC 600

CTCCTGATAA TCGTCCTGGC CATAATCGCa ATAGAGGGOG ACTGTGCCCC TGAGGAGAAA 660

ATCTGGGAGa AGCTGAGTAT GTTGC3AGGTG TTTGAGGGGA GGGAGGACAG TGTCTTCGCA 720

CATCCCAGGA AGCTGCTCAT GCAAOATCTG GTGCAGQAAA ACTACCTGGA GXACCGGCAG 780

GTGCCCGGCA GTGATCCTGC ATGCTACGAG TTCCTGTGGG GTCCAAGGGC CCTCATTGAA 840

ACCAQCTATQ TGAAAGTCCT GCACCATACA CTAAAGATCX3 GTGGAGAACC TCACATTTCC 900

TACCCACCCX: TGCATGAACG GGCTTTGAGA GAGGGAGAAO AGTQA

1 11 21 31 41 51 

I I I I I i 

MPI.EQRSQHC KPEEGLEARG EALGLVGAQA PATEEQQTAS SSSTLVEVTL GEVPAADSPS 
PPHSFQGASS FSTTINYTIiW RQSDEGSSHQ EEEOPRMFPD LESBFQAAIS RKMVELVHFL
I>UCyBASEFV TKAEMIfSVI. DNOOOFFFVI FSKASSyi<QI. VFGIEWEW PISHLVXI.VT 



B TSYVKVIiBIIT IiKIOOBPHZS 300 
yPPLHERAIA SGEE 






GGCACOGCCC TTAGGAGGGC CACCCTCAGA GTCTOACAGC AG8TQAAGGT CCTAAATCTC 

CCCAAACTAA CTGGTGTCTT TTCTCCTCTT CCAAGATGCT CTTCCOGAGG GAGATGCTAG 180

CCCTTTGGOT CXnTACCTCC TGCCCTCAGQ AGCCCCGGAG AGAGGCAGTC CTGGCAAAGA 240

GCACCCTGAA GAGAGAGTGG TAACAGCGCC CCCCAGTTCC TCACAGTCGG CGGAAGTGCT 300 

GGGCGAGCTG GrGCTOGATG GGACCGCACC CTCTGCACaT CAOGACATCC CAGCCCTGTC 360 

ACOGCTGCTT CCAGAGGAGG CCCGCCCCAA GCACGCCTTG CCCCXX31AGA AGAAACTGCC 420 

TTGGCTCAAG CAGGTGAACT CIGCCAGGAA GCAGCTGAGG CCCAAGGCCA CCTCXX3C3W5C 480 

CKCtataOA AGGGCAGGQI CCCAGCCAGC GTCCCAGGGC CTAGATCTCC TCTCCTCCrC 540

CACGOAGAAO CCTGGCCCAC CGGGGQACCC GGACCCCATC GTGGCXH'CGG AGGAGGCATC 600 

AQAAOrGOCC CTTTaGCTGQ ACOGAAAGQA OAQTaCGGTC CCTACAACAC CCGCACCKCT 660 

GCAAATCTCC CCCTTCACTT CQCAGCCCTA TGTGGCTCAC ACACTCXXXTC AGAGGCCAQA 720 

ACCCGGGOAQ CCTGGGCCTG ACATOGCCCA GGAGGCCCCC CAGGAGGACA CCAGCCCCAT 780 

GGCCCTGATG GACAAAGGTG AGAATGAGCT GACTGGGTCA GCCTCSVGAGG AGAGCCAGGA 840

GACCACTACC TCCACX^iTTA TCACCACCAC GGTCATCACC ACCGAGCAGG CACCa«3CTCT 900 

CTGCAGTaiG AGCTTCTCCa ATCCTGAOGG GTAC3VTTGAC TCCaGOBACT ACCCACTGCT 960 

OCCCCTCAAC AACTTTCTOa AOTGCACajA CAACGTOACA GTCTAC^CTG GCTATGGGGT 1020 

G6ASCTCCAG GTOAAOAGTO TGAACCTGTC CGATGGGGAA CTGCTCTCCA TCCJCOXSGGT 1080 

GGAGQGCCCT ACCCTOACOS TCCTGGCCRA CCAGACACTC CTGGTGGAGG GGCAGGTAAT 1140

CCX3AAGCXXC ACCAACACCA ICTCCXSTCrA CTTCCGGACC TTCCAGGACG ACGGCCTTGG 1200 

OACCTTCCAG CTTCACTACC AGGCCTTOVT GCTGAGCTOC AACTTTCCCC GCOGGCCTGA 1260 

CTCTGGGGAT GTCACGGTGA TGGACCTGCA CTCAGGTGGG GXGQCCXACr TTCACTQCCA 1320 

CCTGGGCTAT GAGCTCCAGG GCGCTAAGAT GCTGACATGC AICAATGCCT CCAASCOBCn 1380 

CTGOAGCAQC CAGGAQCCCA TCTGCTCAGC TCCTTGK36A GOGGCASTGC ACAATQCC3VC 1440

CATOGGCCGC GTOCTCTCCC CAAGTTACC!C TQAAAACACA AATGBGAOCC AATTCIGCaT 1500

CTGGACGATT GAAGCTCCAG AGGGCCAGAA GCTGCACCTG CACTTTGAGA GGCTOTTGCT 1560 






GCATG&CMU3 GA^CAOGAtGA aQOTTCACMS GQQGCAGACC AACMGTCAG CTCTTCTCTA 1620 

CGACTOOCTT CAAACCGASA CI6TCCCTTT TGAGGGCCTG CTGAGCGAAG GCAACACCAT 1680 

CCGCATOGAQ TTCACGItXJG ACCAGGCCCX! GGOSGCCrCC ACCTTCAACA TCCGATTTGA 1740 

AGCGTTTQAG AAAGGCCACT GCTATGAGCC CTACATCCAG AATGGGAACT TCACTACATC 1800 

Ca3A02CGACC TATAACATTG GGACTATAGT GGAGTTCACC TGCGACCXKG GCCACTCCCT 1860 

QGAGCAGGQC CCGGCCATCA TCGAATGCAT CAATGTGCGG GACCCATACT GGAATGACAC 1920 

AGAGCrCCTG TGCAGAGCCA TGTGTGGTGG GGAGCTCTCT GCTGTGGCTG GGGTGGTATT 1980 

GTCCCCAAAC TGGCCCGAGC CSrTACGTGGA AGGT6AAGAT TGTATCTGGA AGATCCA06T 2040 

GGGAGAAGAG AAACGGATCT TCTTAGATAT CCAGTTCCTG AATCTGAGCA ACAGTCACAT 2100 

CTTOACCATC TACGATGGOG ACGAGGTCAT GCCCX»CATC TTGGGGCAOT ACCTTGGGAA 2160 

CAGTGGCCCC CAQAAACTGT ACTCXn-CCAC GCX»GACTTA ACCATCCAGT TCCATTCGGA 2220 

CCCTGCTOGC CTCATCTTTG GAAAGGGCCA GGGATTTATC ATGAACTACA TAGAGGTATC 2280 

AAGGAATGAC TCCTGCTCGG ATTTACXXWA GATCCAGAAT GGCTGGAAAA CCACTTCTCA 2340 

CACGGAGTTG GTGCGGGGAG CCAGAATCAC CTACCAGTGT GACCCCGGCT ATGAC»TOGT 2400 

GGGGAGTGAC ACCCTCaVCCT GCCAGTGGGA CCTCAaCTGG AQCAGCXJACC CCCCATTTTG 2460 

TGAGAAAATT ATGTACTGCA CCGACCCCGG AGAGGTGGAT CACTCGACCC GCTTAATTTC 2520 

GGATCCTGTQ CTGCTGGTGG GGACC3VCCAT CC3VATACACC TGCakACCCCG GTTTTaTeCT 2580 

TGAAGGGAGT TCTCTTCTGA CXTTGCTACAG CCGTQAAACA GGQACTCCCA TCTQGAOQTC 2640 

TCGCCTGCCC CACTGCX3TTT CAGAAGCGGC AGCAGAGACG TCGCTGGAAO GGGGGAAC3VT 2700

GGCCCT6GCT ATCTTCATCC COGTCCTCAT CATCTCCTTA CTGCTGGGAG GAGCCTACAT 2760 

TTACATCACA AGATGTCOCT ACTATTCCAA CCTCCQCCTG CCTCTGATOT ACTCCCACXZC 2820 

CTACAGCCAG ATCACOGTGG AAACCXSAGTT TGACAACCCC ATTTACGAGA CAGGGGGAAC 2880

OCAAAAGGTT TAGGGTTTCA TTTAAAAAOA GGTAOCXTTTT AAAAAGGGGC TTGTGAACTC 2940 

AACCCCAATT TCCCCOASAC ATTTATCCAA AGGCCXTTOGG GGCCTTGATT TAAACCCCCA 3000 

AAAlGGOGGCT GTTTTTTQGT TAAACTTTTT AACAAAQGGT TACGGGTTTT TTCCCXXraAT 3060 
TTTATAAATT TTAAAASTO 



Seq ID NO: 208 Protein sequence:
Protein Accession #: NP_066938

1 11 21 31 41 51

I I I I I I 

MAQEAPQBDT SPMALMDKGE HEtTCSASEE SQETTTSTIl TTTVITTEQA FALCSVSFSH 60 

PEGYIDSSDV PLiPIJOIFLB CTTOVTWTG VGVEI.QVKSV NLSKGELLSI HGVDGPTliTV 120 

liANOTLLVEO QVIRSPTKTI SVYFRTFQDD CTjGTFQUIYQ AFMLSCMFPR RPDS6DVTVM 180 

DLHSGGVAHF HOOiGYELQG AKMLTCINAS KPHHSSQBPI CSAPC6GAVH NATIGRVIiSP 240 

SYPENTHGSQ FCIVreiEAPB GQKUHjHFER LIAJDJKDRKT VHSGQTNKSA liiYDSIflTES 300 

VPFEGIiLSEG NTIRIEPTSD QASAASTFNI RFEAPEKCHIC YEPYIQNGNF TTSDPTYNIG 360 

TIVEFTCDPG HSI.EQGPAII ECINVRDPTO NDTBPLCRAM CGGELSAVAO WLSPNWPEP 420 

YVEGEDCIMK IHVGEEKBIF LDIQFIJn.SH SDILTIYDGD EVMPHII.GQY LGNSaPOKLY 480 

SSTPDLTIQF BSDPAGLIFG KGQOFIMHYI BVSRNDSCSO LPBIQNGWKT TSBTELVRGA 540 

RITVQCSPGY DIVCSOTLTC QHDLSWSSDP PFCGXIHrCT DPGEVDKSTR LISDPVLIiVG 600 

TTIQYTOlPa FVLEOSSLLT OrSBETOTPI HTSRLPECVS EAAAETSLEG GNMAIAIFIP 660 
VtllSIiLUSa AYIYITRCRY YSHIALPUW SHPYSQITVE TBFI»rPIXET GGIQiCV 

Seq ID NO: 209 DNA sequence

Nucleic Acid Accession #: NM_001327.1 

Coding sequence: 89-631 

1 11 21 31 41 51

I I I I I I 

AGCAGOGGGC GCT GTC TOTA CCOAGAATAC GAGAATACCT CGTGGGCCCT OACCTTCTCT 60 

CIGAGAGCCQ GOCAGAGGCT CCXSGAOCCAT GCAGGCCQAA GGCCGGGGCA CAGGGGGTTC 120 

GACGGGCQAT GCTGATGGCC CAOGAGGCCC TGGCATTCCT GATGGCCC3U5 GGGQCAATGC 180 

TGGCGGCCCA GGAGAGGCGG GTGCCACGGG CGGCAQAGGT CCCCBGGGCG CAGGGGCAGC 240 

AAGGGCCTCG GGGCCGGGAG GAGGCX3CCCC GOGGGGTCCG CATGGCGGCG CXSGCTTCAGG 300 

GCTGAATGGA TGCTGCAGAT GCGGGGCCAG GGGGCCGGAG AGCCGCCTGC TTGAGTTCTA 360 

CCTCGCCATQ CCTTTCGC6A CACCCATGGA AGCAQAGCTG GCCCGCAGGA GCCTGGCCCA 420 

GGATGCCCCA CCGCTTCCX3G TGCCAGGGGT GCTTCTGAAG GAGTTCACTQ TGTCCGGCAA 480 

CATACIOACT ATCCGACTQA CTGCTGCaWJA CXaw:OGCCAA CTGCauSCTCT CCATCAGCTC 540 

CTGTCTCCAa CAGCTTTCCC TGTTGATGTQ GATCACSCAG TGCTTTCTGC CXXjTGTTTTT 600 

GGCTCAQCCI CCCTCAQGGC AXSAGGCOCTA AGCCCASCCT GGCOCCCCTT CCTAGGTCAT 660 

GCCTCCTCCC CTAGGGAATQ GTCCCAGCAC GA GTOGC CAG TTCATTQTGG GGGCCTQATT 720 
GTTTaXOGCT GGAGQAGGAC GGCTTACATG TTTGTTTCTG TAiQAAAATAA AACTOAGCTA 

Seq ID NO: 210 Protein sequence:
Protein Accession #: NP_001318.1

1 11 21 31 41 51

I I I I I I

MQAEGHQTGO STGDAIXSPGO POIFDGPGGN AGGPGEAGAT G6RGPRGAGA ARASGPGGGA 60 
PRGPHGGRAS GUIGCCRCGA RGPESRIABF VIiAMPPATPM EAELfiHRSLA QDAPPI.PVPG 120 
VIAKEFTVfiG NILTIRLTAA DHRQMLSIS SCLQQLSLLM W1TQCFI.PVF IAQPPSGQ8S 



Seq ID NO: 211 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 52-459

1 11 21 31 41 51

I I I I I I 

CCTCGTGGGC CCTOACCTTC TCTCTGAOAG COGGGCAGAG GCTCOGGAGC CATGCAGGOC 60 
GAAG6CCAGG GCACAGGGGG TTCGAGGGGC OATGCTOATG GCCCAGQAGa CXXnOOCATT 120 
CXnXSATGGCC CAGGGGGCAA TGCTOGC9GGC CCAGGASAOa CXSGOIGOCAC GGGCGOCAGA 180 
GGTCCCCGGG GCGCAGGGGC AGCAAGGGCC TCGGGGCCGA GAGGMGOGC OCCGOGGGGT 240 
CCGCATGGC6 GTGCOGCTTC TGCXiCAGGAT GGAAGGTGCC CCIGCGGG6C CAGGAGQOCS 300
GACAGCCGCC TGCTTCAGTT CCX3ACTGACT GCTGCAGACC ACOGCCAACT GCAGCTCTCC 360

ATCAQCTCCT GTCTCCAGCA GCTTTCCCTQ TTGATGTGGA TCACGCAGTQ CTTTCTQCCC 420 

GTGTTTTTGG CTCAGGCTCC CTCRGGGCAG AGOOQCTAAG CCCUSCCIGa CXSCCCCTTCC 480 

TAGGTCA-rac CTCCTCCCCT AGGGAATGGT CCCAQCAOGA QTQGCXAGTT CATTGIQGGQ 540 

GCCTGATTGT TTGTCGCTGG AS3RGGACGG CTTACMOTT TQTTTCTQTA GAAAATAAAQ 600 
CTGAGCTA 

Seq ID NO: 212 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

I I I I I I

MQAEGQOTGG STGDAOOPGO FGIPDGFGCai AGGFQEAQAT aQROPROASA ARASOPRGOA 60 

PRGPHGGAAS AQDGRCPCGA RRPDSRIiLOP RLTAADHRQL QLSZSSCLQQ IiSIjLMHITQC 120 
PLPVFLAQRP SGQRR 

Seq ID NO: 213 DNA sequence
Nucleic Acid Accession #: NM_000555
Coding sequence: 416..1498

1 11 21 31 41 51

I I I I I I

CTTATTTTTT ATGAATGTOG GATAGCTGCA CCAGCTTGGT GGGGAAAGGG TTTGATQAAT 60 

AGCACAAAGA CACTGGCTGT TCCCTGGAGG CTGTCCCTTT AAAGQAQAAT CTTAGTTTAT 120 

TCTGGGGQGA GGGGATGCAC ACATTAGAGT AGGAAAGAGG GCTTGGAATA AAATGAAAAC 180 

ACTCCCCCTT CATAGTCATT GTACTGAAAT GCAAAGACTG CTTCCTAAGC TGGAGATGCT 240 

AACCTTGGGT AGCTCCTTCT GTTCTCTTCA AGGGGAATTT TGTCAGGCTA TGQATTCATT 300 

TACAACTGTT AGTCATGTGG GCATGTQTGA GGAAAC3M3AT GCCAGTTTTA ATGTATTTAG 360 

CCXSSAAOTTC CAATTTOATA GGAGCCACTG TCAGTCTCTG AGGTTCCACC AAAATATGGA 420 

ACTT6ATTTT GGACACTTT8 AOGAAAGAGA TAAOACATCC AGGAACATGC GAGGCTCCOG 480 

QATGAATGGG TTGOCTAGOC CCACTCACAG CGCCCACTGT AGCTTCTACC GAACCaOAAC 540

CTTGCAGGCA CTGAGTAATG AGAAGAAAGC CAAGAAGGTA CGTTTCTACC GCAATGOGGA 600 

CCGCTACTTC AAGGGGATTG TGTACGCTGT GTCCTCTGAC CX5TTTTCGCA GCTTTGACGC 660 

CTTGCTGGCT GACCTGAOjC GATCTCTGTC TGACAACATC AACCTGCCTC AGGGAGTGCG 720 

TTACATTTAC ACCATTGATQ GATCCAGGAA GATCGGAAGC ATGGATGAAC TGGAGGAAGG 780 

GGAAAGCTAT GTCTGTTCCT CAGACAACTT CTTTAAAAAG GTGGAGTAC» CC3UVGAATGT 840 

OUVTCCCAAC TGGTCTGTCA AOGTAAAAAC ATCTGCCAAT ATGAAAGCCC CCXAGTCCTT 900 

GGCTA6CAGC AACAGTGCAC AGGCCAGGGA GAACAAGGAC TTTGTGCGCC CCAAGCTGGT 960 

TACCRTCATC CGCAGTGGGG T6AAGCCTCG GAAGGCTGTG CX3TGTGCTTC TGAACAAGAA 1020 

GACAGCXXAC TCTTTTGAGC AAGTCXTTCAC TQATATCACA GAAGCCATCA AACTGGAGAC 1080 

OGGGGTTGTC AAAAAACTCT ACACTCTGGR TGGAAAACAG GTAACTTGTC TCCXTGATTT 1140 

CTTTGGTGAT GATGATGTGT TTATTGCCTG TGGTCCTGAA AAATTTOGCT ATGCTC3U3GA 1200

TGATTTTTCT CTGGATGAAA ATGAATGCCG AflTCATOAAG GGAAACCCAT CAGCCACSiOC 1260 

TGGCCCAAAG GCATCCCCAA CACCTCAGAA GACTTC3kGCC AAGAGCCCPO GTCCTATGCa 1320 

CCGAAGCAAG TCTCCAGCTG ACTCAGCAAA CGGAACCTCC AGCAGCCAGC TCTCTACCCC 1380 

CAAGTCTAAG CAGTCTCCCA TCTCTACGCC CACCAGTCCT GGCAGCCTCC GGAAGCACAA 1440 

GGACCTGTAC CTGCXrrCTGT CXTrTGOATQA CTCGGACTCG CTXGGTGATT CCATGTAAAG 1500 

GftGGGGAOAG TGCTCAQAGT CXaVGAGTACA AATCXXAGCC TATCATTGTA GTAGGGTACT 1560 

TCTGCTCAAG TCTCCAACAG GSCTATTGGT GCTTTCAAOT TTTTATTTTG TTOTrrOTTGT 1620

TATTTTGAAA AACACATTQT AATATGTTGG GTTTATTTTC CTGTGATTTC TCCTCTGGGC 1680 

OVCTGATOa CAGTTACCAA TTATGAOAGA TAQATTOATA ACCATCCTTT GGQGCAGCAT 1740 

TCCAGGGATG CAAAATGTGC TAGICCATGA CCITTCAATG GAAAGCTTAG GGGCCTGGGG 1800 

TAAATTT6CC CCGTTTAAAT TTCCCCAAAC AGTrTTCCTT TTGTAaAGQQ GTGTTTAAAT 1860 

ATACAGCAAT TAAAAA6TTT GTOTGGGGAA AAAAAAAACT CATTGaCAGA TCCAAGAATG 1920 

ACAAACACAA GTGCCCCTTT ICTCTGGATC TCAAGAATGG TGGAGGACCC TGGAAGQACA 1980 

GCAAGGCAGC TCCCCAGCCT CACTCTTCAC TCCTGATTGA GGCCCGGGTT TGTTGTCCAG 2040 

CACCAATTCT GGCTGTCAAT GGGGAGAAAT AAACCAACAA CTTATAATTG TGACACCA6A 2100 

TGCTTAGGAT CCTGGTGCTG GGTTAGCTAA GAGAATAGAC AGAATTGGAA AATACTQCAG 2160 

ACATTTCCQA AGAGTTTATA AAGCACAOTG AATTCCTGOT CAATCTCTCC ACTGAGGCAA 2220 

TTTGGAATCA ATAAGCAATT GATAATAOTT TOGAOTAAGS GACTTCATAr ACCTGATTCC 2280 

TCTAGAAGGC TGTCTAACAT ACCACATGAT TAC%TGAACT GTATGOTATC CAT CTATC TC 2340 

TeXTCTArrS AATGCCTTOT TAACAGCCAA CACIGAAAAC ACTaTOAOAA TTTOTTTTCA 2400 

OCTCTOACAC CTTTCAaTCT CTTTTTATAG CAAOAAATCA ATATCCTTTT TATAAAAATT 2460 

CATGTCTQTA TTTCAGGAGC AAACTCTTCA GGCTCCTTTT TTATAAACTQ GTGATTTTTC 2520

TTTTGTCTAA AAAACACATG AAGAAAATTT ACCAGAAAAA AAAAAAAAAG CC6AAGAATA 2580 

ATGTTATTTA GAAATTATGC TGTCACTGCC AAACAGTAAC CTCCAGGAGA AAACAAGATG 2640 

AATAGCAGAG GCCAATTCAA TAOAATCAGT TTTTTGATAG CTTTTTAACA GTTATGCTTG 2700 

CATTAATAAT TTCAAT6TGG ACC3W3ACATT CTAATTATAT TTTAAATQAA ATGTTACSVaC 2760 

ATATTTTAAO CAACTCTTTT TATCTATAAT CCTAATATTT CATACTGAAO ACACAGAAAT 2820 

CTTTCACTTG TCTTTAACAT TAGAAAGGAT TTCTCTTTAC TAAGGACTGA TCATTTQAAA 2880 

TAOTTTTCA6 TCT T TTaAOA TACA6QTTTA TAACACTQCT TTTTTTTTCC TQTAAACaTA 2940 

GCCCATAAXG GCAAAAACAA CTAATTTTAA TTGAAGGTCT TGCTTGCCAH TCCTGIGTTG 3000 

GCTTTNACCA AATATAAAAA TTCCCTTATT CCTTGGTAAT GGTGCSUVATN TTTGGAAAQG 3060 

CACAGCATCC AAACCAAGCT GCTGTTTGGC TACTGAATGG CTTGCSUSTTG TTCCTCCACT 3120 

CTAAATGGAA TGAGCTTGCT GTGTGTGTGT GTGGTGGTGG TGGGAGGGGG TGGTGCATGT 3180 

GTGTGTGTGT GTCTGCATCT GCftGCTGCTT CAAAATTAAG AAATACTACA AGACACCCCT 3240 

GTAATGGATT GGTGGCAACT GGGTQGCACT GCTGATGTGC ACTGTGTAGG GGGGAACCC» 3300 

GTGGTGGTGG GGTATCTCAA ATGCOCCTAG ACAAGCTTCA GATGTCrGTA GCTACCAAAA 3360 

ACATTTTGSG TTCAAGAAAA GTSAGATGAT GGTAGTACTO GTTTCTGGTG AAATTGAAAA 3420 

ACaXCAAATS ATOAeGATCT CTTTTTGCCC CCTCTCCTTT TTTTOTAAAC CCATTCAAAA 3480 

CC3VTTAATAA GCCCATTTTA CTAANCXXJCT ATTTCTTTCT AGAAGCTCAG GGTTTOCTTA 3540 

GTGCCTCCCA NAACATTTTG TAGTTAATTG GGAAAAAGTQ ATACTTGGAT TAGGGGGTGT 3600 

GGGCATAAAG AATGGTGGGA GGCCTGATTT TAAAATTCAG GCCAGAACCC CCAATGACTC 3660 

CACCCATAGT NTCACTTTAO GTCTCATTTA QTCXSVTCACC TTTATTTTAA GTTGAGGAAG 3720 

TGGAGGCTGG TAAAGAGCAG GACCAGAGGA AGAATCCAGA TTTCCTTATG CTTGGGCCTC 3780 

ACACTAGCTC TNTGAOTATT TCXTTTGATTG CGGTATATGT ACTACTAGAA AATACC3VAAT 3840 

GGATATATTT TCTTTAGGAT AACCTTTOAA CCAACRATNT TCAATAACAA TAQTAOVTCT 3900

TCCATCTTAC TTTTAATCGA GTATftAGGAA ATGTTTCTTT ATOaCCATTT TGQAGGGAGC 3960

AGGGGATGAG GCTTGGOVTA GTCCAAAATT TAAOTCTCCA ATAATTAATT GCATTTTAAA 4020 

TTGTTrTAAA TTQGCCCACT TTCAAOGCAA TTTTTTTTOT GTGTCTGTAA CTGAGCTCCT 4080 

CCACCCCTGT CATTCACTTC CAATTTTACC CAATCCAATT TTAGCACTCA AOTTCCATTO 4140 

TGTTAATTTT TQCACOGTCT ACACACATCA AGTCAGCAAQ C»TTTGCCAC CACTCCCTAT 4200 

ACTTCTCCCT CTTTTTTACA CACACACACA CACACACACA CACAATCCAT CTCTTGCTTa 4260 

TTCCTACCTC CCTGATTTTT CTTCCCTACA GAAATAGAAA TAGGQACAAA OAAGGGGAAA 4320 

ATGTATATAT TGGGGCTGGG CTGAACAACT AACTTCATAA GTAGTATTAA CTAGGC3GTAA 4380 

ATTGAGAGAA AAGCTCCTTT TCTCTTCACT GTTTTGGAAA GGATAGCCAT TAGCATGACT 4440 

GCTTTGTGTC CTTATGGACT TTAGTATTAG CCTAQATTGA ATTATAGOST TTTTCTAGCT 4500 

GAAGGAACCT TAAGATCACA TCATCIACTC CTCTACTCXn AATTTCTCAT TCTTCAGGCC 4560 

AQGAAACCQA QACACAGAGO TAAAGTAATT TCCCXAAG6T CACACAGCTO GCTGGGGCAG 4620

GATTGGGTTT ACAACCCACA TCTCCTGGCT CTTATTCCAG GGCCTTTTCC CaCTAAGTAO 4680 

TATTGCCTTC CATTAGGCTC CTGAGAGTTA TTTCTCAGGG TCATGTTGCA TCTTGGAGCC 4740 

ACATGCTGCT GCCCTGATCT CAGTGGGAAA TNCACCCAGC AACCTAATAC AGCCCCTTTT 4800 

CCCTGCATTC ACCTGGTTCC CATCCACATG GGTTGCAGAT GTCCTTGAAG AGAGTGAGGC 4860 

ATTGAGGGCC AATAGGAGCA ATGGGGTCCC TGGCCTTGTC CATCTGATTC AGGAGATCAC 4920 

TGCTCCATCG TQAGGAGCCC TCTGAATAGC CKCCCACTGA ATGCTTGCCT TGCCCAAATG 4980

GAATGOAOGA ACSATTQATTT TCTCCATCAG TTCACCTTGr GTCATCTCAT AATGGTTGGT 5040

CTTTCCAGGC TGAGGGAAAT GTTTCTTGTT TCCAHAGTAH AAAAAAGAAA GAGTGGAACA 5100 

ATAHCmm TCATCCTAAC TTTCTGA6AT GQCTTTTCAA CATTTAAAAA AAACTAGTGT 5160 

GGTACCATTC ACTGGCANGA TTTNTTTTAG AATATGGGAG TAAGAXGAGG TAGAGAAAAT 5220 

AACCTGGTCT CACTGTGGTT GCCCTCATCX: ACAATGTCCC CAAAGCCATC CTGCTOTGAT 5280 

GAGGACAATT TCCAGGTATA AGCAAQGGGC TTTOTGACAA AAATGTACCC TGGCTGATGT 5340 

TAAACATTGG CTCCTGTGTT TGCACCAAAA TAGCAAGCTG TGTGCTCTAT ACACTCTTCC 5400 

CATCX3TCTTG TGTACACTGC TCCTGTGGCC TTCCACAGCA QAAACCAGGG CAAAAGGGTC 5460 

CAAACACATG GTTTTCCTTG CTGCAAGGCT NTTCCTGGGA ACTAAGGGGG TATTTATTAG 5520 

TTCAGTTOTA AGAGACCTCC TTCTGGGCTT ACCCXIACTCC TCAGGTACTT CTCTCTCCTT 5580

ccrecrrcTC ctccacagtc ac^agtaacc aaggaacctg aaagtggatg tgtagctatt 5640

T6AAGAAGGC AAOGAACCCT GAGATTCTTC TTTGAATCCT TTAGTCCAAG TCTTAGACCA 5700 

QTGATTGGTG CTTACCTTGA ACAAAATTTT GTCTGTGTTC CTAATCCCTT CaWVTACTNTG 5760 

GQTACRATGC TCCCAATCAC CCTGCACATT TGATTCTAAA TOGCTTTTAT TTTTTAAAAA 5820 

TCCATATCCC TAGGACAAGA NAACAGGATQ CCTATAXCCC CAAAATGAOC TCCAB6IVCAC 5880 

TGATQGQAAT GATCCCAANO ATCACCCCAC CTCAOAAAAC GTCTGTGCCA ANAGACCTCC 5940 

CCAGATAQAA NCACTGOGAC AGTGOTTXOA AQQACITCTT TTATGGTTaT CCAaTTTGCT 6000 

ATGGAAATAA AAGGC31TTGA TTTTTTAAAA AAGATGATTG GAACCTGTCT TTGGCX3kCAT 6060 
AGGGCCACTT GGATCCATTT CCAGGCCTTA CTCATATATT GCCTTCACTG AAGGGCTrra 6120

GCTTTAAGTC CCAGACTGGT CTCCCAAQTG AACCATAAOT GTTTTGGAac TCATCTG6GG 6180 

TGAGGCATGA GAATGTTGCC CCATCTATCC CTTCAGQRAA AOGTGCCTTC CCTCCCrTTC 6240 

TCCTAAAGCC TC3GTCCCCAA AAATTGTTTT TSTCTOCAAA AGTCTAQTAT OGTCTTTATA 6300 

CftCCCAMACT CTTAGTGTTG CGTCCTGCCT TGTTTCCTTG TTAAGGATCT ATG CAKAC CT 6360 

CCOGCTTTGa CTTAGCTAGC 6TCACATTG6 CTATCATTTG ACAAGACTAA CTTTTTTTTT 6420 

TTTTTTTTTa ACTGAGTCTC CCTCTGTCAC CTAGGCTGGA GTGCAGIGaC ACAATCTTGG 6480 

CTOGCTGCAA CCTTCACCCT TCACCTCCCA GGTOaAAaCa ATTCTCCTGC eTCAGTCTCC 6540 

aUGTAGCTG GOATTA CAGO OOTGOGCCAC tSlAATCTGGC TMTTTTTTA TTATTATTAT 6600 

TTTTA6TAGA GATOGGGTTT CACCATGTTG GCCACjACIGG TCTTGAACTC TIGQCCTCfA 6660 

ATTATCTGCC CACCTCGGCC TCCCAAAGTG CTGGGATTAC AGOCATGAGC ACCATGCCCA 6720 

GCTGACAAGA CTAATTTTTT ATCCCTTGGT TTATTGGCrr CAACATCTTC TGGAATCAGA 6780 

GQTGATTTTT TCTTACCTTG GATGCCTGAG ACTAGGGGAG TATAGAATTC CSXATTGGTAA 6840 

TTAAGGCATC TTTCTGCTCC TGATCAGAAG GGCAGGTTAG TTGGGAGAGO TCASATQGCA 6900 

CAACAGAAGT CACCTTGTAA GTAAG6CAAA GACTTTGAAG GCATTAGC6T TTCTCATTAC 6960 

TTAGOTCAAT AACCTTGAG6 6AATCAATGO CITTTTTGCC GCICTACCTC TTTGTGTATC 7020 

TCTTTGACTT TTCTTTCTCT OTCTAOTTTC CTCTGTTCTC AGTTTATAIT CTATQTTATC 7080 

AGTCTCTCTT TCCACAGCAC AAACATCCAT CCTTTCTCX^T GTGCAATTCT GTCTCZOCCT 7140 

CTTATTATCT TTATTTGTAC TTTTTCCTTC CTCCCTGTCT AGGCATTOGG CATGTGCXrrC 7200 

TTCTTAGCCT GTGATTTTGC CTTGGGACTG ATGATAAATT ATTTCCAGAT TCAATCAGCC 7260 

CTGGTCCTAC CCCAGTCCAA TCAGAAGTAT GTTGGTGGGG AATCRACCTG ATCCTGGCCC 7320 

TTTCTTCTTC TCCATTTTCA TTCGTAATCC CCCTCAGCAG ATCTTTACAA GCAGTTTCCT 7380 

TATAGCTCAT GTATCTTTAG GTCTTTGCCT TCX31AGCACT GTACAGAATA CTTTGTGGTT 7440 

CKTTTTTAGT CTGftCATTTT GTGGAGCAGT GAAGC6TGCT CAGAGACATA ATCAGCTGAA 7500 

GBGAAAAAAX CCAOCCAtGG ATTTATATCA GCTAAATACT AATAATTGAT TTTGTTTGAT 7560 

OIGOCCKinA TTTTTAAAGC TGCAATATAA TATAATGAGG GACC3W»GGT AATTTCTCCT 7620 

GTCATTTGTT TTGGCTGGAT GGGGGTGGGG GAGTAATTGC TTAAAGTTTT ACCATTACAC 7680 

ATTAAACTCT CTATAATAAT CTTGTTTGGG GCTTGCrAAC TGTTGAaCTG TTTTAACTAA 7740 

ACTGGTAGGC AATC3GGAGTT GATTTAAATG AAAAGATAAT TTAACAAATC TATACTATAA 7800 

AAAGAGACAT TTGCTTAATT GACATGTATT TTTTCCTTCT GAGTCACCTA AACATTTACT 7860 

CTTGACACCA ACTGTTCATG ATACTGAATA GACAGTCXaVT ATAAGAGAAA TTAQTGGACC 7920 

TAAAGAAGCC AGATTGTAGG TGTTAATTTA TTAAACAGAA TTGC34AAGCC CTTGGAAATG 7980 

TCACTGCTTG GCAATACCAT ATGGCATGCC AAAATTTACA ATGACTTTTC TTTATAAGTT 8040 

ATCCAAAAGG GATTTGAACA AGTAAGAGGT TATGCCAAAA TGTCTCCAAT GTATGQTCCT 8100 

GTAATATATT GCAGCTTGAA GCCAATGATC CCTTATGACT TGTATACAAC TAATGCATGT 8160 

TTTATTGAAT TTTGCATTTC CCACXTrOTGG TAAGTCTTTA AAATGTTTTT GATCACCTTT 8220 

HTCTGCCATT AAACTTGTAC AGAAAATGIT TTTATGGCCA TTTTCAAAGG GAGAAAGTTT 8280 

AAAATGGAAA CAGCCCACCC TTTCraCCCX ATAGCTGTAG TTAGAATTGA GTACCTGIAO 8340 

CAAAACAGCT GTAATTGGTG GTTGTAGTGT TAGAEGTGTT A6CTTGCTA6 TGACTAGCTT 8400 

TQQAGAGTAA ATGCATGGTA TTOTACATCA CATTTCTTAA CTajTTTTAA CCTCTGAAAA 8460 

GAATATATTC TTCTTTQTAG TCCTTCTTCX: CACCCCXTTTG CCCTCTCCCT CTOCCTGCTC 8520 

COVGTTGTCT TACAGTTGTA AATATCTGAT TTGAGGCCCA ATAACTCTTG CCAAGTAAAG 8580 

TCAGCAAACA ACAAACAAAC CAAAATGTGG GGAAAAGGCA TTTCTCAACC ATCTCTCAGC 8640 

AGTTATTGAT CATTTCTTAA GGAACAGCAT TGTQATCAAA GACTCARCTT TACGTAAAAA 8700 

TCAOTGGTAA ATTGQGGTTG TATTGGCCAT TGATTACATT CAGGATTGflA TAGTTTTCAG 8760 

AATCACATGT AATCCAAAGA CAGTAGGTAG TGATGTCCCT TATCCCTGCA GCTGTTTTAA 8820 

GATAGAGACC TCAQAAGACT CTGCTTGACC GATGACCAAT AATTATTIGA AAAAAAAAOA 8880 

AAAAATGAGA GAAATAAAAC AGATATTTAA GAACTTTAGC CACCTATTTA GAATAGITAT 8940 

AGCCAGAAAA AAAAACAAG6 GCATOAGTTC AAATGCATTA CTATCASTOT CCTAGGCAAT 9000 

ACCTAACCTA CTCTGAAATT GTGATTCAAA AGCAGTATTT CAAGRGGCAT TCTOCTTTTT 9060 

TQGTTTGCTO ACCCCACTTG GACTGGTAGO TTTQGZOAGG CCCCCATAAA CCAGCTGOAG 9120 
CAQACCCTTT TCATCTCXTTG TGCCTOTAAC ACCCCTCTTC CCCXROXCC TOCGCAATTC 9180

AATGAGGGCT TTCTIGGGTC AGAGGACTTC MGOrraTCT AGftOAAGTIT GCCRTGTCTG 9240 

TAAGGTGCT6 TGAACTGTGA GTGCIGAAGA TTCGCAGCAT TCAATACCAQ GCAGCCAAAC 9300 

AGCTQCTCTT GCaATTATTT TGGCTCTCA& GCTCTGTrCT TCATCBOITT CTCRTTICTG 9360 
TGTACATTTG CAAGATGTGT GTAATGTCRT TTTCCftAAAA TAAAATTTGA TTTCAAT 



1 11 21 31 41 51

| | | | | |

MELDPGHB'DE RDKTSRNMRG SRMNGIiPSPT HSAHCSPYRT RTWALSNSK KAKKVRFYHN 
15 GDRYPKGIVY AVSSDRFRSF DAMJfflLTRS LSDNIMLPQG VHYIYTIDGS RKIGSMDELE 

EGESWCSSD NFFKKVEYTK NVNPNWSVNV KTSANMKAPQ StASSHSAQA RENKDFVRPK 
LVTIIRSGVK PRKAVRVIiN KKTAHSFEQV LTDITEAIKIi ETGWKKLYT LDGKQVTCm 
DFFGDDDVFI ACGPEKFRYA QDDFSIiDBHE CRVMK(3IPSA TASPKASPTP QKTSAKSPGP 
MRBSKSPADS ANGTSSSQIiS TPKSKQSPIS TPTSPGSUtK RKDI>yi<PI>Sl:. DDSDSI^SM 



Seq ID NO: 215 DNA sequence
Nucleic Acid Accession #: NM_a
Coding sequence: 312..644

1 11 21 31 41 51

| | | | | |

GGCACGAGGC AGAGCTCTGC AAGGAGAGGT TGTGTCTTCG TTCTTTCCGC CATCTTCGTT 
CTTTCCAACA TCTTCGTTCT TTCTCACTQA CCGAGACTCA GCCGGTAGGT CTGCAGAGTG 
30 GTCTTCCTGG TAATTTAGTT GTQAGTGAAT GTGTGGAGGA GCCRGOSGGC TTAGGACAGG 
TCCTGTGGCA CAGTCCGTGG CTTTGAGGGA AAAGGGCCTC GCGGTGGTCC TCCGCCTTCC 
CCCAGGTCGT QATGCAGGCG CCATQGGCCX3 GTAATCOTGG CTGGGCTGGA ACGAGGGAGG 
AAGTGAGAGA TATGAGTOAG CATGTAACAA. GATCCCRATC CECAGAAAGA GGAAATOACC 
AAGAGTCTTC CCAGCCAGTT G6ACCTGTQA TTGTCCAGCA GOCCSVCTQAa GAAAAAOGTC 
35 AAQAAOAGCSA ACCACCAACT GATAATCAGG GTATTOCACC lAGTCaGGAG ATCAAAAATO 
AAGGAGCACC TGCTGTTCAA GGQACTQATO TQGAAGCTTT TCAACAGGAA CTGOCTCTQC 
TTAAGATAGA GGATGCACCT OQAOATCGTC CTGATGTCAG GGAG6QGACT CTGCCCACTT 
TCOATCOCAC TAAAaTGCTO GAAGCAGaiO AAGGGCAACT ATASGTTTAA ACCAAGACAA 
ATGAAGACT6 AAACCAACSUl TAITGTTCTT ATGCTGGAAA TTTGACTGCT AACATTCTGT 
40 TAATAAA6TT TTACAGTTTT CTGCAAAAAA AAAAAAAAAA AAA 

1 11 21 31 41 51

| | | | | |

MSEHVTESQS SERGNDQESS QPVGPVIVOQ PTEEKRQEEE PPTDNQGIAP SGEIKNEGAP 
50 AVQGTDVBAF QQEtALLKIE DAPGOGPDVR EGTLPTFDPT KVLEAGEGQL 

Seq ID NO: 217 DNA sequence

Nucleic Acid Accession #: NM_001476.1

Coding sequence: 82..433

1 11 21 31 41 51

| | | | | |

GCC»GGGAGC TOTGAGGCAG TGCTGTGTGG TTCCTGCCGT CCXSGACTCTT TTTCCTCTAC 
TGAGATTCAT CTGTGTGAAA TATGAGTTGG CGAGGAAOAT CQACCTATTA TTGGCCTAGA 

60 CCAAGGCGCT ATGTACAGCC TCCTGAAGTG ATTGGGCCTA TGCGGCCCGA GCAOTTCSyST 

QATGAAGTGG AACCAGCAAC ACCTGAAGAA GGGGAACCAG CAACTCAACG TCAGGATCCT 
GCAGCTGCTC AGGAGGGAGA GGATGAGGGA GCATCTGCAQ GTCAAGGGCC GAAOCCTQAA 
GCTGATAQCC AG6AACA6GG TCACCCAC3«3 ACTGGGTGTG AGTGTGAAGA TGGTCCTGAT 
GGGCRGGAGG TGGACCCGCC AAATCCAGAG GAGGTGAAAA OGCCTGAAQA AGQTGAAAAG 

65 CAATCACAGT GTTAAAAGAA GACACX3TTGA AATQAT6CAO QCTGCTCCTA TOTTGOAAAT 
rrGTTCAXTA AAATTCTCXX: AATAAAGCTT TACAGCCTTC TGC34AAA 



Seq ID NO: 219 DNA sequence
Nucleic Acid Accession #: NM_001476
Coding sequence: 90-3671

1 11 21 31 41 51

| | | | | |

ACAGOSGAGC GCAGAQTGAG AACCACCAAC CQAGGOGCCQ GGCAGCGACC CCTGCAGCG6 
AQACAOAOAC XGAGCGGCCC GGCAC06CCA TGCCTGCGCT CTGGCrQGGC 1 
QCtTCTCGCT CCTCCTGCCC QCAGCajQOS CCRCCTCCAG G 
ATQGQAAGTC CAGGCAiGTGT ATCTTTOATC GGGAACTTCA CAGACAAACT GGTAATQSAT 
TCOQCTQCCT CAACTGCAAT OACAACACTQ ATGGCATTCA CTGOGAGAAG TGCAAGAAXG 
QCTTTTACCG aCACAOAGAA AGGGACCGCT gm i a CCCTO CAATTGTAAC TCCAAAGOTT 
CTCTTAGTGC TCGATGTGAC AACTCTGGAC GGTGCftGCTQ TAAACCAGGT OTGACAGOAO 420

CCAQATGCGA CCGATGTCTG CCAGGCTTCC ACATGCTCAC GGATOOGaaG TGCACCCAAG 480 

ACCAGAGACT GCTAGACTCX: AAGTGTGACT GTGACCCAGC TGGCATCSCA GGGCCXrrGTQ 540 

ACGGQGGCCG CTGTGTCTOC AAGCCAGCTG TTACTGQAGA ACGCTGTGAT AGGTGTOQAT 600 

5 CAGOTTACTA TAATCTGGAT GGGGGGAACC CTOAGGGCTG TACCCAGTGT TTCTGCTATG 660 

GGCATTCAGC CAGCTGCOGC AGCTCTGCAG AATACAGTGT CCATAAGATC ACCTCTACCT 720 

TTCATCAAGA TGTTGATGGC TGGAAGGCTG TCCAACGAAA TGGGTCTCCT GCAAAGCTCC 780 

AATG6TCACA GCX3CCATCAA GATGTGTTTA GCTCAGCCCA ACX3ACTAGAC CCTGTCTATT 840 

TTGTGGCrCC TGCCAAATTT CTTGGGAATC AACAGGTGAG CTATGGGCAA AGCCTGTCCT 900 

10 TTQACTACCG TGTGGACAGA GGAGGCAOAC ACCCATCTGC CCATGATQTG ATTCTGG AAO 960 

QTGCTGGTCT ACGGATCACA GCTCCCTTGA TGCCACTTGG CawaACACTQ CCTTOTGeGC 1020 

TCACCAAGAC TTACACATTC AGGTTAAAT6 AGCATCC3UM5 CAATAATTGQ AGCCOCCAGC 1080 

TGAGTTACTT TGAGTATCGA AGGTTACTGC GOAATCTCRC AGCCCTCCGC ATCOGAaCTA 1140 

C3VTATGGACA ATACAGTACT GGGTACATTG ACAATGTGAC CCTGATTTCA GCCCX3CCCTG 1200 

15 TCTCXGGAGC CCCAGCACCC TGGGTTGAAC AGTGTATATG TCCTGTTGGG TACAAGGGGC 1260 

AATTCTGCCA GGATTCTGCT TCTGGCTACA AGAGAGATTC AGCX3AGACTG GGGCCTTTTG 1320 

GCACCTCIAT TCCTTGTAAC TGTCAAGGGG OAGGGGCCTG TGATCCAGAC ACAGGAGATT 1380 

GTTATTCAGG GGATOAGAAT CCTGACATTQ AGTGTGCTGA CTGCCCAATT aOTTTCTACA 1440 

. AOGATCCGCA CGACCCTCXSC AGCTGCAAGC CATOTCCCTO TCATAAOGGQ TTCAQCTGCT 1500 

20 CAGTGATGCC GGAGACGGAG GAGGTGGTGT GCAATAACIO CCCTCCCGOO GTCROOSCSTG 1560 

CCCGCTGTGA GCTCTGTGCT GATGGCTACT TTGGGQACCC CTTTGGTGAA CRTOGCCCAQ 1620 

TGAGGCCTTG TCAQCCCTGT CAATGCAACA ACAATGTGGA CCCXa«3TGCC TCTGGGAATT 1680 

GTGACCGGCT GACAGGCAGG TGTTTGAAGT GTATCCACAA CACAGCCX3GC ATCTACTGCX3 1740 

ACCAGTGCAA AGCAGGCTAC TTCGGGGACC CATTGGCTCC CAACCCAGCA GACAAGTGTC 1800 

25 QAGCTTGCAA CTGTAACCCC ATGGGCTCAQ AGCCTGTAGG ATGTCGAAGT GATGGCACCT 1860 

GTGTTTGCAA GCCAGGATTT GGTGGCCXS» ACTGTGAGCA TGGAGCATTC AGCTGTCCAG 1920 

CTTGCTATAA TCAAGTGAAG ATTCAGATGG ATC3U5TTTAT GCAGCAGCTT CAGAGAATQG 1980 

AGGCCCTGAT TTCAAAGGCT CAGQ6IGGTQ ATGGAGTAGT ACCTGATACA GAGCTGGAAG 2040 

GCAGGATGCA GCAGGCTGAQ CAGOCCCTTC AGQACATTCT OAQAOATQCC CAQATTTCAG 2100 

30 AAGOTGCTAG C»GATCCCTT GGTCTCCAGT TGGCCAAGGT GAGGAGCCAA GAQAACAGCT 2160 

ACCAGAGCCG CCTGGATGAC CTCAAOATGA CTQTGGAAAG AGTTCGGGCT CTGGGAAQTC 2220 

AGTACCAGAA CCGAGTTCGG GATACTCACA GGCTCATCAC TCAGATQCAQ CTQAQCCTGG 2280 

taUHVAAGTOA AGCTTCCTTG GGAAACACTA ACATTCCTGC CTCAGACCAC TACGTGGGGC 2340 

CAAATGGCTT TAAAAGTCTG GCTa«3GAGG CCACaU^GATT AGCAGAAAGC CACGTTGAGT 2400 

35 CAGCC3«JTAA CATGGAGCAA CTGACAAGGG AAACTGAGGA CTATTCXAAA CAAGCCCTCT 2460 

CACIGGTOCO CAAQGCCCTG CATGAAGGAQ TCX3QAAGOGO AAGOSGTAGC CCGGACGGTG 2520 

CIGTGGIGCA AGGGCTTOTG OAAAAATTGO AQAAAACCAA GTCCCIGGCC CAGCAGTTGA 2580 

CAAGGOAGGC CACTCAAGCG GAAATTQAAG CAGATAGQTC TTATCAGCAC AGTCTCCX3CC 2640 

TCCTOGATTC AGTGTCTCGG CTTCAGOOAQ TCAGTGATCA GTCCTTTCAG GTGGAAGAAG 2700 

40 CAAAGAGGAT CAAACAAAAA GCGGATTCAC TCTCAACGCT GQTAACCA(3G CATATGGATG 2760 

AGTTCAAGCX5 TACACAAAAG AATCTGGGAA ACTGGAAAGA AGAAGCACAG CAG CTCT TAC 2820 

AGAATQOAAA AAGTGGGAGA GAGAAATCAG ATCAGCTGCT TTCCOGTGCX: AATCTTGCTA 2880 

AAAGCAGAGC AC»AGAAGCA CTGAGTATGG GC3UVTGCCAC TTTTTATGAA GTTGAOAaCA 2940 

TCCTTAAAAA CCTCAGAGAG TTTGACCTGC AGGTGGACAA CAGAAAAGCA GAAOCTQAAQ 3000 

45 AAGCCATGAA GAGACTCTCC TACATCAGCC AGAAGGTTTC AGATGCTAGT GACAAGACCC 3060 

AGCAAGCAGA AAGAGCCCTG GGGAGCGCTG CTGCTGATGC ACAGAGGGCA AAGAATGGGG 3120 

CCGGGGAGGC CCTGGAAATC TCCAGTGAGA TTGAACAGGA GATTGGGAGT CTGAACTTGO 3180 

AAGCCAKTGI GACAaCAOAT GQAGCCTTGG CCATGGAAAA GGGACTGGCC TCTCTGAAGA 3240 

GTGAGATGAG GGAAGTGGAA GOAGAGCTGG AAAGGAAGGA GCTGGAQTTT GACACGAATA 3300 

50 TGOATGCAGT ACAGATGGTG ATTACAGAAG CCCAGAAGGT TGATACX^GA GCCAAGAACS 3360 

CTGGGGTTAC AATCCAAGAC ACACTCaUVCA CATTAGACGG CXTTCCTGCAT CTGATQGACC 3420 

AGCCTCTCAG TGTAGATGAA GAGGGGCTGG TCTTACTGGA GCAGAAGCTT TCCXX3AGCCA 3480 

AGACCCAGAT CAACAGCCAA CTGCGGCCCA TGATGTCAGA GCTGGAAGAG AOGGCACSTC 3540

AGCAGAGGGG CCACCTCCAT TTGCTGGAGA C3UVGCATAGA TGGGATTCTG GCTGATGTGA 3600 

55 AGRACTIGGA GAACATTAGG GACAACCTGC CCCCAGGCTG CTACAATACC CAGGCTCTTG 3660 

AGCAACASTQ AAGCTGCtavT AAATATTTCT CAACTGAGGT TCTTGGGATA CAGATCTCAG 3720 

GGCTOGGGAO CXaTQTCATG TaAGTGQGTa GGATGGGGAC ATTTGAACAT GTTTAATGGG 3780 

TATGCTCAGG TCAACTGACX: TOACCCCATT CCTGATCCCA TGGCCAGGTG GTTQTCTTAT 3840 

TQCACCATAC TCCTTGCTTC CTGATGCTGG GCAATGAGGC AGATAGCACT GGGTGTGAOA 3900 

60 ATGATCAAGG ATCTGGACCC CAAAGAATAG ACTGGATGGA AAGACAAACT GCAC3U3GCAO 3960 

ATGTTTGCCT CATAATAGTC GTAAGTGGAS TCCTGSAATT TGGACAAGTQ CTCrrrGGORT 4020 

ATAQTCAACT TATTCTTTGA GTAATGTGAC TAAAGQAAAA AACTTTGACT TTGCXX3K3GC 4080 

ATGAAATTCT TCCTAATGTC AGAACAGAGT GCAACCC3W3T CACACTOTGO CCAGTAAAAT 4140 

ACTATTGCCT CATATTGTCC TCTGCAAGCT TCTTGCTGAT CAOAGTTCCT CCTACTTACA 4200 

65 ACCCAGGGTG "TGAACATG-rT CTCCATTTTC AAGCTGGAAG AAGTGAGCAG TGTTGGAGTQ 4260 

AGGACCTGTA AGGCAGGCCC ATTCAOAOCr ATGGTGCTTG CTGGTGCCTG CCACCTTCAA 4320 

GTTCTCGACC TGGGCATGAC ATCCTTTCTT TTAATGATGC CATGGCAACT TA GAGATTG C 4380 

ATTTTTATTA AAGCATTTCC TACCAGCAAA GC31AATGTXG GGAAAGIATT TACITTTTCG 4440 

GTTTCAAAGT GATAGAAAAG TClGGCTTGa GCATTGAAAG AGGTAAAATT CTCTAGATTT 4500

ATTAGTCCTA ATTCAATCCT ACTTTTOGAA CACCAAAAAT GATGCGCAIC AATGTATTTT 4560

ATCTTATTTT CTC3UVTCTCC TCTCTCTTTC CTCCJWaXAT AATAAGAGAA TGTTCCTACT 4620 

CACACTTCAG CTGGGTCACA TCCATCCCTC CATTCATCCT TCCATCCATC TTTCCATCX3V 4680 

TTACCTCCAT CCATCCTTCC AACATATATT TATTGAGTAC CTACTGTGTG CCAGGGGCTG 4740 

GTGGGACAGT GQTGACATAQ TCTCTGCCCT CATAGAGTTG ATTGTCTAGT GAGGAAGACA 4800 

75 AQCRTTTTTA AAAAATAAAT TTAAACTTAC AAACTTTGTT TGTCACAAGT GGTGTTTATT 4860 

GCAATAACCG CTTGGTTTGC AACCTCTTTG CTCAACAGAA C»TATGTTGC AAGACCCTCC 4920 

CATOGGGGCA CTWIAGTTTT GGCAAGGCTG ACAGAGCTCT QGGTTGTGCA CATTTCTTTG 4980 

CATTCCAGCT GTCACTCTGT GCCTTTCTAC AACTGATT6C AACAGACTGT TGAGTTATQA 5040 

TAACACCRGT GGGAATTGCT GGAG6AACCA GAGGCACTTC CAOCITGGCT GQGAAGACTA 5100 

80 TGGTGCTGCC TTGCTTCTOT ATTTCCTTOG ATTTTCCTGA AAGTGTTTTT AAATAAAGAA 5160 
CAATTGTTA6 ATGCC 

MPALVnjGCCIi CFSLLI.FAAR
tXSIHCEKCKN GFYRHRERDR 
HMLTDAOCTQ DQRLLDSKCD 
PEGCTQCPCY GHSASCRSSA 
SSAQMJ>PVY PVAPAKFLGM 
MPU3KTLPCX3 LTKT!fTPRUf 
DNVTI.ISARV VSGAPAPWVe 



OJNCPPGVTG 
CIHNTAGIYC 
HCEHGRFSCP 
QDILRDAQIS 
RLITQMQLSIi 
ETEDYSKQAL 
ADRSYQUSLR 
trnKEEAQQUi 
QVDNRKAEAE 
IBQEIGSUn. 
AQKVDTRAKH 
MMSGLEERAR 



ATSRREVCDC 
CLPOJCNSKa 
CDPAGIAGPC 
EYSVHKITST 
QQVSTCQSLS 






FHQDVDGMKA VQUiGSPAKL QWSQRHQDVF 
FDYRVDRGGR HPSAHDVltE GAOUllTRPL 
LSYFEYRRLL KHLTALRIHA TVGEYSTGVI 
QFOQDCASGY KRDSARL6PP GTCIPCNCQG 



ARCELCASGY 

DQCKAGYFGD PIAPNPADKC 
ACYNQVKIQM DQFMQQIiQRM 
EGASRSLGLQ LAKVRSQENS 
AESEASLOrr NIPASDKYVG 
SLVSKALHEQ VGSGSGSPSG 
LLDSVSRI^ VSOQSFOVEG 



VRFCQPCX3CH 
RAQICHPMOS 
BALISKAQGG 
YQSRLDDLKM 
PKGFKSLAQE 
AWQGLVEKI. 
AKRIXQKAQS 



NNVDPSASGN CDRI.TGHCLK 



EAMKRX.SYIS 
EAMVTADGAL 
AGVTIQDTLN 



DGWPOTEbE GRKQQAEQAL 
TVERVRALGS QYQNHVRDTH 
ATRLABSHVE SASNMEQLTR 
EKTKSUMQQIi TREATQABIE 
IiSTLVTRHMD EFKRTQKNLG 
GHATFYBVES ILKNLREFDL 
QKVSDA80XT QQAERAI^GSA AKDAQRAXHO AGEMSISSE 

ERKEIiEFDTN MDAVQHVITE 

VLLEQKLSSA XTQINSQLRP 
PPGCYITTQAL EQQ 



AMEK6UVSLK 
TLOOLiaiiMP 
TSIDGILADV 



1020

1080 
1140 



Seq ID NO: 221 DNA sequence
Nucleic Acid Accession #: NM_016529
Coding sequence: 13-1854



GTCAAQAAAA 
AAAGGGGCTG 
ACATTATGCC 
GCTGATCTCT 
ATATTGAAGG 
CTQCTACTTQ 



ATAATGTGAT 
ATCTGGAATA 
CTGAGAATGA 
ACAGAGCTCA 



GCBATTAATA 
AAGGAG6ACT 
AATTTGCTGG 
GOGCTCTCCT 
ATATGCTGCA 
GTGAAGGCCA 
GCCCACOTGG 
TACGCCATCG 



TAGGO TATTC 
CTTTGQATGC 
GCAAGGAAAA 
TCGAAGTCCG 
GAGTGTCTCC 
TCACCCTCGC 
GTGTGGGAAT 
CACAGTTTTC 



21 
I 

- AATTGTTCGA 
TTTTGAGAGA 
CTTTGCCACG 
GTATGAGGAG 
AOGGTTGGAA 
CATAGAAQAT 
AATTAAAATA 
CTGCCGATT6 



31 



51 



I 1 1 

ACTCCTTCAG GACGACTTCG GCTTTACTGT 
CTTTCAAAAG ACTCAAAATA TATGGAGGAA 
GAAGGCTTGC GGACTCTCTG TGTGGCTTAT 
TGGCTGAAAG TCTATCAGGA AGCCAGCACC 
GAGTGTTAOG AGATCATTGA GAAGAATTTG 
OGCCTTCAAG CAGGAGTTOC AGAAACCATC 



TGACGTGGCC 
GAGGAGTTTC 
TCTGCAGAAQ 
CATCGGAGAC 
CAGTGGGAAT 



TGCATOGGCC 
GAGAGGTCTT 
AATGGCGAAG 
TCCCTC31TCC 
GGTCATGCTA 

TGGGGAAGCA 
ATTCCCATTG 
TGGTTGGGAT 



GCACrCAGGA 
GCTTCAACAC 
TCrrCTGGTT 
CCGACTATTT 



CATCTTGTAC 
TGTTAATQGA 
GATTTTCACC 



AAAGGTTTTC 
TCXrCATGAAA 
ATTTGTTGGA 



COAGTCCrGG 



C»GCfteGGCG 
6AAGAAGTCA 
AATTTTCCTG 
TTTGTCAGAG 
AGTTAAGCAG 
AGCTATCTTT 
ATQAAGCATT 



TGCTGACCTG GCTGGTGTTT 
CTCCAGATAT GAGAGGACAG 
TATTTCTGGT TCCTACTGCC 
CCIGCAAAAA GACATTGCTG 
GAAAAGCGGT GCTGCGGQAT 
ASAQGCTGGG CCGGAAGAGQ 
TCCCGCATQG 



GCCATTACTC 
CTCATCATCG 
CTGGATTTGQ 
TCTGA6ATAG 
GGCGCCAACG 
GAAGGCATGC 
AAGCTTCTGT 
TGCTTCTATA 
TTTTCTGGGC 
GCTTTGCCGC 
AGGTTTCCCC 
TGGGGTCACT 
GCTCTGGAGC 
AATATTGTTT 
GCTTGGACTA 
TTTGGCATCT 
GCAACTATGS 
TGTTTGATTG 



AGCAATGGAA 



ATATGGCCCT 
AGCACTGCAC 
ATG6CCACAC 
CACTCTCGTG 
TGGATGTGGT 
ATGTCGGGAT 
AGGCCACCAA 
TGGTTCATGG 
AGAACGTGGT 
AGATTTTATT 
CCTTCACTCT 
AGCrCTACAA 



tatcx:tattg 
tgaccttggg 
cctgaaotac 
caaagcggtc 
gaagaagcg6 
gatccagaca 
caactcggat 
agcctggagc 
cctgtatatt 

TGAAajTTGG 
GGGAATCTTT 
AATCACCCAG 
CTTGGTCCAC 



ACTCGACCAT 
TCCTGAGCTC 
AAGATGTGGC 
AGGAGCTGGA 
AGAGGCTGAA 



TGTTGTTACT 
TCTGGCTQTC 
CTGGCCCACC 
CGCACACTTC 
ATGGAGAGCA 
AACCAAGTCT 



1020 
1080 
1140 
1200 
1260 
1320 
1380 



TCCGraCTTA TGACACCACC 
ACTGATCITA GOAAAGKBAT 
AAGACTGGOO TCCAAGGCXA 
TTTGTTAGTT ACAXATTCCC 
GCCCTCCCAA CTCGTCtGCA 
CAACTGTGCT CTGTSAGOTC 
AAAAAAA 




MSVIVRTPSG 



KABIKIHVIiT 
KENDVALIID 
TLAI(33GAIin} 



EUJUiYCKGAD NVIFERLSKD 
YQEASTILKD RAQRLEECYE 
GDKQETAINI GYSCRLVSQM 



31 

1 

SKYMEETLCK 
IIEKMLLI.LG 
MAIiIIiIOCEDS 
LSCKAVIOCS 
ATMNSDYAIA 



VGMIQTAHVG 

NVVLYIIELH FAFV1R3FSGQ 
LYKITQNGBG PHTKVFWQHC 
TYVWTVCLK AGLETTAWTX 
I.SSAHFWIiGL PLVPTACI.IB 
RAVLRDSMOK RLIIEROItI.IK RLGRKTFPTI. FRGSSLQQGV 
BAYOTTKKXS RKK 



Seq ID NO: 223 DNA sequence
Nucleic Acid Accession
Coding sequence: 1-394



IiEYFATEGIiR 
ATAIEDRLQA 
UlATRAAITQ 
VSPIOKSBIV 
QFSYIiEKLIdi 
YNVIFTALPP 



TLCVAYADLS 
GVFETIATLL 
HCTDI.GNLU3 
DWKXKVKAI 



FTUSIFERSC 360 



LTWLVFFGIY 
CKKTLLEEVQ 
FHGYAFSQBE 
AACGCTGGGC AGGGCCGGCG CGGGTCGGGQ GGCGCCOGftQ GGGtXJOQGGC CGAGCX3GOGG 60

CGCGCAGGGC GGCAGCATCC ACTCGGGCCG CATCX3CCGCG GTGCACAACG TGCCGCTQA6 120 

CGTGCTC3VTC CGGCCGCTGC CGTCCC3TGTT GGACCCOSCC AAGGTGCRGA GCCTCGTGGA 180

5 CACX3ATCCGQ GAGGACXX3U3 AC3«3CGTGCC CCCCATCGAT GTCXTrCTGGA TCAAAGGGGC 240 

CCftGGGAGGT GACrJKrrTCT ACTCXOTTGG GGGCTGCCAC CGCTACGCGG CCTACCAGCA 300 

ACTGCAGCGA GBflACCATCC CXMCCaAGCT TGTCCAGTCC ACTCTCTCAG ACCTAAGGGT 360 

GTACCTGGGA GCATCCACAC CAGACTTGCA GTAGCAGCCT CCTTGGCaCC TGCTGCCACX: 420 

TTCAAGAGCC CAGAAQACAC ACCTGGCCTC C3VGCAGGCTG GGCCATGCAG AAGGQATASC 480 

10 AGGGGTGCAT TCTCTTTGCA CCTGGCGAGA GGGTCTGACT CTGGGCACCC CTCTCACCGQ 540 

CTACAAGGCC TTGGACTCAC TGTACAGTGT GGOAGCCCCA GTTCCCACCT CTGTOACAAT 600 

AOOATCATGG CCTTACCCTT GAAGCATTAC CGAGAAGGAG AACAGAGATG GGCTTGAAGA 660 

GCCACGTCCT GCCGGCTCCA AATTCCCAAG GACAAGGATC CCTCTGCATT TTTGTCTATG 720 

TAACCTCTTA TATGGACTAC ATTCAGCTGC AAGOAAAGOA AAACCTTQAT TGCAGTGGTT 780 

15 TAAACAAACA GAAGATTGTT TTTCCACATA GCATGGATTC TGGAGATGGG TGGCTAATGG 840 

TATTGGTTCA ACAACTCCAC GGAGGTAGGG GTCACGTCTT GGATCCTTTT GCCTTAATCT 900 

CAGTGCTCGT TACTTCATGG TCCCAAGATG GCTGCTGTAT CCCCRAGAAT CATGTCTGCG 960 

TTCAAGOAAG GAGGGGTGGA GGAAGAGGAA GGGCC3UACT AGCTGGACCC GTCACXHTXTT 1020 

ATCAGAAAGT AAAACCTCGT CAGAAGTCTG TTTCCTGCTC TCTCCCTCTG CATATCITCa 1080 

20 CTTAGATGCC CTTGGCCCGA GCCAGCTACC ATTGCAOCTC TAGCTGCAAA CftAAGCTARQ 1140 

ACAGCAGGGA ACAGAATTQT CATGGCTGAA TAGACCAATC GTGTTCCATC TACTGAGACT 1200 

GGCACACTGC CTCCTGCAAT AAAACTGGGA TCCCATTACC AAQAGAGAAA TGCSMSAATTG 1260 

TGTACCAGTT AGCTTTTGCT GTGTAACAAA CCATCCCCAA ACTTGGCAGC TAGAAACAAA 1320 

OCCrGTATTT TCOCACAATC CTATGGGTTG GCAATTTGGG CTGGGCTCAA CAGGGCAGTT 1380 

25 CTGCTGCTCA CACXTIGGGAT CCCTCATGGA GCTAAGGTCA GCTGTTACCT CAGCTGGGCC 1440 

TGGATGGTCT AGGATAGCCT TACTCACTTa CCTGGCAGGT GACAGGCTGT TGGCTGGAAT 1500 

TQCTTGGTTC TCCTCC»TGT GGCCTCTCCA GCAGGCTAGC TCAG GCTT AT TCACATGATG 1560 

GCTTCAG6AT TCCAAAGAGA GTGAGAGraa AAGCT6AAAG ACTTCTTOAQ TTCTTGGCCT 1620 

GGAACTGGGA CTAGGACAGT GTCACTTCTG CTAAGTTCTT TTGGTCAGAG CAAATCACAA 1680

30 GGCTTTACCC AGATTCAAGG GATGAGAAAC AGACTACATO TCTTGATOAO GGGAACCACA 1740 

AAGAGCTTGT GGCX»TTTTT CACCTATCAC AAATAATTTT GGATGGGTAT TTATTTGGAT 1800 

AAAGGTATTT CCCTCTTCCC CCTTTCTCTC TGTCTCATGG GGCCTCACTC TGCCAAGTTG 1860 

GAAGGCACTA AGACATTGTC CTGGCCCTCA GCGTCTAGGG GAAGAGGTGT TGGGGCAGGA 1920 

AGTGAGTCTC TCC3VTGGGCT GGACCCACTG TASTAGGAOT GCCTCCrrOT CTGCACTGCT 1980 

35 GGTATGGGGT TAGGCCAGGT AGGACATTCC AGAGGGGCTT CTGAAAACCA AGAGTCCCTG 2040 

GGGAAAGGGA ACAOAGTAAG GCAGGCCTTa TTCTCACTGC CCTCTAAGGG AACTTG GTCA 2100 

CrcGGCACTT TTAAGCCTCA GTTTCTCCAG TTCAATAATA AGGACAAGAG CTTTTCXX»T 2160

GCATTCTCIT TCCCCGGQAR ASTTGRCIOR QSTOAOCRGT AATAOAATTO AAAACGGAfiA 2220 

GTGTCTTCAG XGCAATGTGG CATCCTGGAT TGGGTCTTGG AACRAAAACR GGAC3VTTAOT 2280 

40 GGGAAAATTG GAAATCTGAA AAAAGTCTGA ATTTTAGTTA ATATACCMT TTCRGTCTCT. 2340 

TGGTTTTOAC AOATOTACCA TGGTGATGTA AOATGTTGAC CTTQGGOTAG GCTGGGTOAA 2400 

GGOTATACAfi GAACTCTTTG TACTATCTCP OCAACTTCrC TGTAAATCTA GTATCaTTCC 2460 

AAAATAAAAG TTTATTTAAT TTAAAAAAAA AAAAAAAAAA AA 



TUatAOMlRG APEGPGPSGO AQGGSIHS6R lAAVHNVPLS VLIHPLPSVL DPAIWQSI.VD 60 

TIRBDPDSVP PIDVLWIKGA OGGDYFYSFQ QCHRYARYOQ LQRBTIPAKL VQSTLSDLRV 120 
YLGASTPDU] 

Seq ID NO: 225 DNA sequence

Nucleic Acid Accession #: NM_021048
Coding sequence: 1..1110

1 11 21 31 41 51

ATGCCTCGAG CTCCAAAGCG TCAGCGCIGC ATCCCTQAAG AAGATCTTCA ATCCCWAAGT 60 

GAGACACAGG GCCTGGAGGG TGCACAGGCT CCCCTGGCTG TGGAGGAGGA TGCTTCATCA 120 

TCCACTTCCA CCAGCTCCTC TTTTCCATCC TCTTTTCCCT CCTCCTCCTC TTCCTCCTCX: 180 

65 TCCTCCTGCr ATCCTCTAAT ACCAAGCACC CCAGAGGAGG TTTCTGCTGA TGATGAGAC3V 240 

CCAAATCCTC CCCAGAGTGC TCAGATAGCC TGCTCCTCCC CCrCGGTOST TGCTTCCCTT 300 

CCATTAGATC AATCTGATGA GGGCTCCAGC AGCCAAAAGO AGGAGAGTCC AAGCACCCTA 360 

CAGGTCCTGC CAGACAGTGA GTCTTTACCC AGAAGTQAQA TAQATGAAAA GGTGACTGAT 420 

rrooTGCAtyr xTCTGCTcrr caagtatcsu^ atgaag gagc oqatcaca aa ggcagaaata 480 

70 CIGGAGAGTG TCATAAAAAA TTATGAAGAC GACTTCCXar TGTTGTTTAQ TGAAGOCTCC 540 

OAGTGCATGC TGCTGGTCTT TGQCATTOAT GTAAAGQAAG TQOATCCCRC TCGOaCTCC 600 

TTTGTCCTTG TCACCTCCCT GGGCCTCACC TATOATGGGA TaCXQAGTGA TGTCCAGAGC 660 

ATGCCCAAGA CTGGCATTCT CATACTTATC CTAAGCATAA TCTTCATAGA GGGCTACTGC 720 

ACCCCTGAGG AGGTCATCTG GGAAGCAOTG AATATGATGG GGCTGTATGA TQGGATGGAG 780 

cywxrrcATT-r atggggagcc caggaagcig ctcacccaag attgggtgca ggaaaactac 840

CTGGAGTACC GGCAQGXGCC TGGCAGTGAT CCTGCACGGT ATGAGTTTCT GTGGGCTCCA 900 

AGGGCTCATG CTGAAATTAG GAAGATGAGT CTCCTGAAAT TTTTGGCCAA GGTAAATGGG 960 

ASIOATCCAA QATCCTTCCC ACTQTGQTAT GAGGAGGCTT TCAAAGATQA 6GAAGAGAGA 1020 

GCCCAGQACA GAATTGCCAC CACAGATGAT ACTACTGCCA TOGCOVOTGC AAGTTCTAOC 1080
80 GCTACAGGTA GCTTCTCCTA CCCTGAATAA 



85 



276 



60 



75 
SSCYPLIPST PBEVSAODET FNPPQSAQIA CSSP&PWASL PLDQSOEGSS SQKEESPSTL 120

QVI.POSBSI.P RSBIOEaCVTD IiVQFLLFIOQ MKEFITKASI LESVIKNYED HFPLtiFSEAS 180

ECaOiLVraiD VKEVDPTGHS FVI.VTSLGLT YDGMLSDVQS MPKIGILILI LSIIPIEQYC 240 

TPBEVIWEAli mwahYDOm RLIYQEPRKL LTQONVQBHY IfYRQVPGSO PARYEFIMGP 300 

5 HAHAEIRKMS LLKPLAKVNG SDPRSFPLWY EEaVLKDEEER AQDRIATTDD TTAMASASSS 360 
ATGSPSYPE 

Seq ID NO: 227 DNA sequence
Nucleic Acid Accession #: NM_005025.1
Coding sequence: 82-1314

1 11 21 31 41 51

| | | | | |

GCGGAGCACA GTCCGCOC5AG CACAAGCTCX: AGCATCCCGT CAGGGGTTGC AGGTGTGTGG 60 

15 GAGGOTTGAA ACTGTTACAA TATGGCTTTC CTTGGACTCT TCTCTTTGCT GGTTCTGCAA 120 

AGTATGGCTA CAGGGGCCAC TTTCCCTGAG GAAGCCATTG CTGACTTGTC AGTGAATATG 180 

TATAATCOTC TTAGAGCCAC TGGTGAAGAT GAAAATATTC TCTTCTCTCC ATTGAGTATT 240 

<3CTCTTaC3UW TGGQAATOAT GGUUVCTTGQO GCCCAAGQAT CTACCCAQAA AGAAAT CCGC 300 

CACICAATGG GATATGAOWi CCTAAAAAAT GGTGAAGAAT TTTCTTTCTT GAAGCJAGTTT 360 

20 TCAAACATGG TAACTGCTAA AGAGAOCCAA TATSIGATGA AAAn6CX3Ul TTCCITBTTT 420 

GTGCAAAATG GATTTCATGT CAATGAGGAG TTTTTGCAAA T6ATGAAAAA ATATTTTAAT 480

GCAGCAGTAA ATCATGTGGA CTTCAGTCAA AATGTAQCXX5 TGGCCAACTA CATCAATAAG 540

TGGQTOGAaA ATAACACAAA CAATCTQGTG AAAQATTTGO TATCCCCAAG GGATTTTQAT 600 

GCTGCCACTT ATCTGGCCCT CATTAATGCT GTCTATTTCA AGGGGAACTG GAAGTCGCAG 660 

25 TTTAGGCCTG AAAATACTAO AACCTTTTCT TTCACTAAAQ ATGATGAAAG TOAAGTCCAA 720 

ATTCCAATGA TGTATCAGCA AGGAGAATTT TATTATGGGQ AATTTAGTGA TGGCTCCAAT 780 

GAAGCTGGTG GTATCTACCA AGTCCTAGAA ATACX:aTATG AAGGAGATGA AATAAGCATG 840 

ATGCTGGTQC TOTCCAGACA GGAAGTTCCT CTTGCTACTC TGGAGCCATT AOTCAAAQCA 900 

CAGCTGGTTG AAOAATGGGC AAACTCT G TS AASAAGCAAA AASTAOAABr ATACiCTQCCC 960 

30 AGGTTCACAG TQGAACAGGA AATTGATTTA AAAGATGTTT TGAAGGCTCT TGGAATAACT 1020 

GAAATTTTCA TCAAAGATGC AAATTTGACA GGCCTCTCTG ATAATAAGGA 6ATTTTTCTT 1080 

TCCAAAGCAA TTCACAAGTC CTTCCTAGAG GTTAATGAAG AAGGCTCAGA AGCTGCTGCT 1140 

GTCTCAGGAA TCATTGCAAT TAGTAGGATG GCTGTGCTGT ATCCTCAAGT TATTGTCGAC 1200 

CATCX31TTTT TCTTTCTTAT CAGAAACAGG AGAACTGGTA CAATTCTATT CATGGGACGA 1260 

35 GTCATGCATC CTGAAACAAT GAACACAAGT GGACATGATT TOGAAGAACT TTAAGTTACT 1320 

TTATTTGAAT AACAAGGAAA ACAGTAACTA AGCACATTAT GTTTGCAACT GGTATATATT 1380 

TAGGATTTGT GTTTTACAGT ATATCTTAAG ATAATATTTA AAATAGTTCC AGATAAAAAC 1440 

AATATATGTA AATTATAAGT AACXTQTCAA GGAATGTTAT CAGTATTAAO CTAAT6GTCC 1500

TGTTATGTCA TTOTGTTTGT GTGCTGTTOT TTAAAATAAA ACTACCTATT GAACATGTG 



1 11 21 31 41 51 

| | | | | |

MAPLGtiFSLL VLQSMATGAT FPEEA1ADL3 VHMYNRUIAT GEDEHIUSP I,SIALAMGMM 
EUSAOGSTQK EIRHSMGXOS LKNQEEFSFL KEPSNMVTAK ESQYVMKIAN SLFVQNGFHV 
NBEFLQMMKK VPNAAVNHVD FSQNVAVANY IHKWVEKNm NLVKDLVSPR DFDAATYLAL 
INAVYFKOIW KSQFRPENTR TFSFTKCDES EVOIPNMYQQ GEFVY6EFSO GSNEAGGIYQ 
VLEIPYEGDB ISMMLVLSRQ BVPLATLSPL VKAQIiVBEWA NSVKKQKVEV YIiPRFTVEQE 
IDLKDVLKAL GITEIPIKDA NbTGLSDHKE IFIiSKAIHKS PtEVNEEGSE AAAVSGMIAI 
SRMAVLYPQV IVOaPFFFLI RHRRTGTILF MGRVHKPETN NTSGUUFEEL 



ATCTGGTGAA GAAGGACTGT GCBGAGTCGT GCaCACCCAG CTACACXXTTG CAAGGCX3VGG 
TCAOCAGOGG CACCAGCTCC ACCCAGTGCT GCCAGGAGGA CCTGTGCAAT GAGAAGCTGC 

65 ACAAC6CTGC ACCCACCOGC ACCGCCCTCJG CCCACAGIGC CCTCAGCCTG GGGCTGGCCC 
TGAGCCTOCT G6CCGTCATC TTAGCCCCCA GCCTOTGACC TTCCCCCCAG GGAAGGCCCC 
TCATGCCTTT CCTtCCCTTT CTCTOGGGAT TCCACACCTC TCTTCCCCAG CCGGCAACGG 
GGGTGCCAGQ AGCCCXSlGGC TGAGGGCTTC CCCGAAAGTC TGGGACCAGG TCCAGGTGGG 
CATOQAATGC TQATGACTTO GAGCAGGCCC CACAGACCCC ACAGAGGKIG AAGCX3VCCCC 

70 ACAOAGGATa C3^CCCCCAQ CTGCATGGAA GOTGOAGGAC AQAASCCCIO TGGATCOCOQ 
QATTTCACAC TOCTTCTCTT TTSTTGCX g T TTATTTTCTA CTCAAATCTC TACA1GGAGA 
TAAATGATTT AAACC 



| | | | | |

MHTAIiLLAA lAVATCPALT LRCHVCTSSS KCKHSWCPA SSREOCTTOT VEPLEGHLVK 
KDCABSCTPS YTLQGQVSSG TSS1X2C0QB1 UMEfOiHNAA PTRTALAHSA LSLGIALSU. 
AVILAPSIi 
Nucleic Acid Accession #: Eos sequence
1 11 
I I 

CCGGGCAGOT GGCTCRTGCT 
21GGGGCGCAO GAATTCTGAT 
AGAAGATQAA GGATATCGAC 
GTGTGAGGtSA GAGAACCAGC 
GGAGAACTCG ACCGTTGGAA 
TCTCTCTTGA TGCCTCCATG 
GAAAGTACCA TCATGGCTTG 
ACXX»GTGGA CAATGCTGGG 
CCGGTGTBGC CCACAAOAAG 
RCGAGTCTTC TGACGTQAAC 
AAGTTQGGCC AGAGGCTGCT 
TCATCCTGTC CATOGTQTGC 
TTCAGGATGG CTGTATTCTG 
CAAGAOTTCA GCCTTCCTTT 
TTCACTGCAC CGCCATCTTA 
TGACTCAGTT CXMCATTTTG 
ACCAGCAACC CGGCTGTATA 






CAGTCTGTGA 
AGTATATCAT 
CX3CACAGAGA 
CCTTGGAAAC 
TCAGAATCCT 
AGCCCATCCG 
GTATGACTTT 



GAGAGAGTCA 
CCATCAGTGC 
TCACGTGAGG 
TGGAAAAGAA 



TTATTTCCTQ Q6ACTTGGCA 
TCTOGTATTA ATTTAATCTC 
GGAATTCAGC GTAGCTACCT 
AGCTCTGACT TACAGCTGCA 
GGCTTCACAT CAATTTTTTT 



TGCTGTCAGC CTGATATACT 
TTCTACTGTG TGTCGTTAAA 
TGTATAACTC GACTCTGTAT 
GTGTrTGTGA GATGCTCTTT 



GTGAAACTAA 
ATAGGAAAAG 
ACTTCTGGGA 
TGCCAAGATG 
CATTCTCAGC 
AGTGCTCTGA 
CTTTTTTCCT 
GGGGAGCTCT 
TSCAQAAOAC 
TCCCTGCOAA GGGTTGTGTG 
CTGATQATCA 
CGGTCAGAAT 
GGAGACTGCT 
CTGAGTTGCT 
6ATTGCATAC 
CAGTGGTGAC 

ACAGtinsnc 

AAAATCTGAT 
AGGAAAAACA 
CCAGACCXSTG 
GTCACCTTTQ 
CTTCCTTTAG 
TGTCAAGATT 
TTCCTTGAGG 
TGCACCAGTA 
TTTGTTTTTA 
TGCAGAATCA 
ACTTTGGACT 
CAACCGTCGG 
GTTTCAATGT 
GAAGATGGTA 



GCCCTGGAAC 
CJCCCAGTCCT 
CCGTX3AAGAT 
AGCAGCCCGA 
GQATGAGGAG 
GACTACTTCC 
TTCGTGGCTT 
CGTGTGGTCT 
OTGGCAAGAA 
GATCrrCTGC 



GTCCTGGAGC 
CTCCACTCAQ 
GGGTATAGAA 
TCCAAGTTCA 



AAACACCAGC 420 



TTGGAATCTG 
TGGTGGGGAT 
AGAAATTAAC 
GTGTCTGGCC 



GCCAATCTTC 
GATATAAACC 
AAGTCTCCTA 
CTCCTAGGAC 



CTGTCCAAGC 540 

GAGCTGAATG 600 

OGCACCAGGC 660 

GGACCAAATT 720 

3 AATCTCTCX3C 790 

f GGGAACAGGC 840 

TTTGGCCCTG 900 

TTGCTAGTAA 960 

TAAAAATCTG 1020 

TTTGATCAGG 1080 

CTAGTGGACA 1140 
1200 



GGTGGGGGAT 
GATTTTTAGA 
CCTAGGATTT 
GCACCAGCCA 
ATTCTCTTTC 
TTGCAACXrrC 
CTGGAAACAG 
AGACCAGATG 
ATGTTACIGC 
CTTTTATATT 



TCCATTTTTQ TCTCTCATTC 
CCTaaGTAGA AQGGTaaATG 
TGGTTTGGCT TTCTTTTGTT 
TGCAAGGACT TGAAAAGAOC 
TTTATTCTGT CCCQAGCAGA 
TTTTGAGCAG AGTACCTCTT 
CTTAGCAGCA AGGTCTTTTT 
AGGAGCCCTC ACTGATTGAG 
ATATGGGTTC TATTCTCTAT 
ACCTGTTAGA TGGCTAGTCC 
AATGCTTCAC CTGCTGTACA 



1 11 21 3: 

I I I I 

MKDIDIGKBy IIPSPGYRSV RBRTSTSGTH 
MJASMHSQLR ILDBEKPKGK YHHSLSAItKP 
VAHKKGELSM EDVWSLSKHB SSOVHOUtliB 
LSIVCLMITQ LAGFSGFNFQ DGCIIiRSS 



TGI 

TAGTAAACTT 
TTTGTTTCCA 
TGTTTGACTG 
TTGTTATGGT 
AAGCTATTTT 
TATATTATGG 
ATTASCCTAC 
AATAGGTTAA 
AATTAAAGTA 
TATGAAAGCT 
TATATATTTT 
TATCCTGTAT 



11 
I 

[ GCrCATATAT 
V ATCRC3VTAAT 
ATCGTA6TCA 
CTCTTTCRTC 
AASTCACAAT 
AAAGC3\AAAC 
ATATCTTTTT 
AATCATCAAG 
ATTAACCAGA 
TATA6XATTA 
AAAOIRGTTA 



GATTCTGTTA 



CTTAGGCTCr 



TTAGTTCCTT 
TTTTTTTAAG 
ACTCTTACTA 
GTAGTTTAAA 
AGCATCCAAA 
AGATQGAGAA 
CKXRTGTAT 



ACTATAATAG 
TAT^AATOTT 
TACATAAAAA 
AAGTCATTAA 



GGAAGATGGA 



AAOTTTGAGT 
ATCCAATGAT 
GTGCTCATGT 
TAATTTAATA 
GAGGACTTTT 
6CCAIGCTTG 



CIGTrCRGGT 
TGAGATOGAG 
ATCATATGGA 
TTTTAATTCT 
CSkATCXXXaVA 
TTTCTGOAAT 



TAGTAATGTA 
ATAACTAGCA 
TAAGATTATT 
GAGAAAACAT 
GAGGTGGGCA 
AGTTTGGAAA 
CACTGTCTCA 
GGQGa«3TGT 



TTTGTGTTAG 
TGAATTATTC 
AACGTGACAG 
ATTAAATATT 
TATGGAAAAC 
ATTGT ATCAT 
ATAAATTTTT 
CAAATTAAAC 
GAGCTXTATA 
TCXrrOGGATA 
TGAGATAACT 
AATTGTTTTA 
GGTTCCCTAA 
GCAAGCACTQ 
AACAATGCCT 
TACCTCATOT 
CTTTCTTAAG 
TAOTTCAGAA 
AATATTTTAA 
TTTTATTAAA 
ATTTGAATTT 
ATTGTTTTAA 
AATATGGAAT 
TTTTTCCAAA 
AAATTAGTGC 
CXCAGTTAAG 
TTTATTACAT 
ATTTCTTCTC 
AATGGATTTT 
CATTTAAGGT 
AGAGAGCTTA 



31 
1 

TIGTTGTTTA 
TTTTTTTACT 
CC 3VATAA CTC 
CTCTCTAGTC 
TTAGATACCT 
TTTATTTTCA 
TATTTTGACT 
AAATTACTAT 
TTTTGGCCTA 
CAQTTGGTTT 
TTACTAATTT 
ATACCTTAAA 

AATTTCTAAT 
TAAATAGGTC 
GGATCTGCCA 
ATACX^GTGG 
AATTTCAGTT 
ACTGTGACTT 
GAAAAGTTTC 
AAOTTAATAG 
TAAATTTATA 
AAATTACCTT 
TAAATAATTT 
ATCAGTGGTT 
CCaCCTCaTA 
AGCTCTTTGA 
AGGAAAATGC 
TTTTGATGGT 
GAAATTTAAG 



CAGTTCACTA 



41 

I 

GTTTTACTTA 
TTTACTCCCC 
TAAACTTTTO 
TTAACCTGGA 
TAAGCCACTG 
AACACTAACT 
AAGCTTTCAT 
TGCATTTTCC 
ATGTCTGQAT 
GGGCAAATTT 
ATACCTGATT 
AAGTTGGTTC 
ATTATATAGT 
TATATATGTT 
ATAAGATACA 
TAGATTTTTT 
GAGTTGGTCT 
CTTAGGTTAT 
GAAAATGAAT 
TGAGTQTGAT 
GAAATCTTTT 
TAATTCAAAT 
ATTATTAGAA 
ATTTAAATCA 
TTCAACCCTC 
ACAATTAAGT 
GTGATTCTAA 
TTTTATTTCT 
TTAAATTCCA 
AGTTTAAGTT 
AAATAGATAT 
CTCTGGAGCT 
ACCTATG6TT 



1440 
1500
1560 
1620 



RDREDSKFRR TRPLECQDAL ETAARAE6LS 



VWIFCRTRI.I 



51 
I 

TTGAGAGTGT 
AAATTATTCA 
AGTTATAAC6 
TTTTAATTTT 
AATTCRGTTC 
TCTTGATATT 
AAAATATTTG 
TATATATGCA 
ATAAAAGATA 
AAACCTGAAA 
TTTTTTCTTG 
TAATTTAAAA 
ATTTCAAAAC 
TCAAAAACCA 
AGGTCTGCAT 



TGATCTAGGA 
GGCTTGTGAC 
TCTTAAAATT 
CTCTCTTTTG 
GTGACAGCAG 
GATAAAAATG 
CTGTGC3CTAT 



ACTTtSkTATT 
CTAAATTTCT 
TTTGTAGTCA 
CCCATGTTAA 
CTAAAOAACA 
CCATCAAACT 
TTCATTCAAA 



CAGAGTTCTG 



AGCTCTCCAG CAGTTCTAGA AAAGCTTTGA 
CTTCACXGCT TCTTAGAAGG TASAATTAAG 
TTAACATTCA GAATTGGGAA TATTAATTTT 



1020
1080
1140
1200
1260
1320
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
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TCCAGTGAGT AGTTTTCTGA 
CAATTTTGTG TTTGTTTACT 
AAAACCTCAT GCCTTTTCAT 
TTTAAAGATG CTTTAATGAA 
CTTCAGAAAT CXXTATATTT 
AQATGaTATT TAAAATGAAT 
TTGGAAGCCQ CCAGCCATTC 
ATTTTATATT ACTATGGTAT 
CTTCTTAAAA CCATAACCTG 
ATAGAGATTC TTCTTTTATG 
AAGACATTAA CATAAGTCTC 
CCACATTAAA CAACCACGGC 
TGTGCCTGGT ATOGCCTCTG 
CCTTCATCAA GCACTTGCCA 
ACAACATCTG CAACTCTACC 
CCCCAAACAC AAAACCACTA 
CACACAACCA ACACACCACQ 
GACAACACAT CACATACACT 



Seq ID NO: 234 DNA sequence
Nucleic Acid Accession #: Eos s
I 27-281 



Mil 
TTTAT6TAAA 
TACATCTAAT 
AAGTATTAA6 
GTCATATTTA 
GCCCAAAAAT 
ATGTAGAGAG 
CTGTGTACCA 
GCTTQCCTTT 



GCATAACTTA 
ACACATTCAC 
CTATCAACTQ 
AATCATAACC 
ACCAAACACC 
CACTACCCCX: 



AATTTGATAT 



AAAATATATA 



ATCTTGTACC 
TTTATAAGAA 
TATTTCTAAG 
TAGTGTTAAA 



CACX3AATOGT 
CTCTAACTTG 
CCAACCTAAA 



AAATAACOTA 
GTGAATTACA 
AACTT CAGT Q 
GATTTGTATG 
AACCTCXTTAA 
TTTGTCCAAA 
AATAATTTAA 
TATTCATTAT 
CACAAAATCC 
ATTACCAGTO 
AACATGAAOA 
TCCIAGGAAT 
CCTCCCTACT 
TACAACCTTA 
OACCCCCAAC 
CCACACaCC» 
CAAGCTAACA 
ACCCACCA 



CRGTTCTAAT 
CCAGAAGTGC 
TCAGTTTATA 
TTGGATAACT 
AGTTTATCTG 
AATTGTATGC 
TAAATTGGTA 
AACATTGTAT 
CATCKTCACA 
CTGACAACCA 
CCaTOCTATA 
TGTCTACOCT 
CCAACTCACC 
ACAAC3VCAAC 
CACACCCAOC 
ACCACAAACA 



2280 
2340 
2400 
2460 



2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
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I 1 I I 

GAGCTGGOGG GAAGACATGC ACCCCTTGAA GACCCAGAGA GAGGCCGTCT 
ATCAGACTGA GACACTTCXTT OTTTACAGGA GACTATAAAA 
GGGGCTGACX3 CCATTTTAGG CX:TCAGCCCA TCTGCACCCA 



ATGTTGCAGG 



ATAAAAACTG 
GATGCCAGGA 
TCAAGCCAAG 
CTCAAACCCG 



OTGCTCATTT 
OAAACAGTGT 
ACCAATCCAA 
TTGCTCTGAT 
GCAAGGAGAT 
AAGAGCTAGT 
CACTTGGCAT 
TTCAGCGGTT 
GGAAAGATGC 
CCAACAGQIO 



GAGCCTT6CA GAAAGCATTA ACGTGCTTTT 



TTTGCTTGTT GGCQCQCTCT 240 



TCTTTTGATG 
GCAA AGAAA A 
TTTATTTTTA 
TTTGACCTTG 
6ATTAAAACA 
AAATCTCAAC 



QCACTTCTCT 



TGACGTTTCA 
TTACGCCCAQ 
TAACAGCAGC 
CCTGTGACTG 
ACTCTATATC 
ATAAAAGACA 
AACTGAATTC 
AAATAATCTT 
TATTAGTAAT 
GCTTTTAAAG 
CCTCCATCTG 
TCTCATTTTC 



CTTGGAGACA 
AGACCAACGT 
CTTTCAGGCT 
ATCAAGAGCC 
CGCCAACAAG 
CA GGGGTAAA 
TTCTGTTTTT 
CTCTAOAACC 
aATGGCTTGQ 



AAAlSCAAAAG 
TOACaCGATC 
AGCAATAATT 
TTTTATTTTT 
CAACTCTGAG 
ATTTCCAGTA 
AGCA6AGATT 
TACATTGTAA 
TAATTATTAA 
CATTTGTACA 
ATTCTCATTT 
CACTGTCTCG 
GSAAAAQAAAA 



OAGATTCTCC 
GGGCTGGTOA 
AGGCGTGGAA 
AAGTGGTAAA 
GTGGGAAAAT 
CATCACAC3AA 
CATGCTGGTC 
GTCATCAGGA 



TAGTGGAAAC 
TTCATGCACT 
CCTGAGAAAG 
GACTAAAACA 
GTAGCAAAAA 



CTCTTTQOCA 
ATAAGGAATA 
CAAGAGAAAG 
AATGTCCAGC 
GGAAATGTTT 
TGGGGATGGA 



G6GAACCTGA AGCCAGGAGG 660 



GTGCTGAGGA 
CCAAGGCCCT 
ATCCAAAGGA 
CTACCTGATT 



TACCCTCAAA 
AAAGATGAGA 
AGGGAAACAG 
GTTTGATTAA 
AGTATGCCAG 
TACATGCATT 
ATTCTTAATG 
AGGAGAATAA 



TCATTGTCAG 
CAAGCTAGAA 
TTGTTCTGTA 



CTAATAAGTG 
ATCCATATCC 
CGTCCATTAC 
GCCAGTGAAG 
ACTTAAAAAA 
TGAGAACAAT 
AGAGGAAGAA 
AGAAATGACC 
TTOSAATTAA 
ACXSATGATTA 
ATCAAAACAA 
TTGCAAATAC 
CATTTTTTAA 
TGCAACAACA 
ATTCTCACGA 
CAGATATATG ACATTAAAAA 



TAAGAAAAAA 
GAATGATTTT 
TTGAACCACA 
TGATTTACTT 
AOVTCTGAAA 
GGTTCTCAGT 
AACATTCCTA 
ATTTGAAAAA 
AAAAAGGTAT 



1500 
1560 
1620 






MHPUCTQRBA VCLPRSSYIR liRKFIiFTOnr XIPAFCSFGA DAIU3IiSPSA PSRSLKQCVA 
PBRLVLLVGA IiSOFRPIQEP OUCH 

Seq ID NO: 236 DNA sequence
Nucleic Acid Accession #: NM_002075
Coding sequence: 406..1428






CCACAATA60 6GCAGACCTO 1 
ACCCAGAGGC 
AATCTCAGCT 
GCTGAGGA6C 
GGCCAGCTCC 
QAGGGAGTAA 
CCCAAGAGCC 



AGTCCTTTCT 
&CCACCCTGA 
GGCCAGGCCA 
CGTCGCAGCT 



ACCACCAACA 

CTCAAATCCC 
CTCTCCTGCT 
TGTGCCTTGT 
GACTGCATOA 



GOACQTTAAa 
TQCrOGTAAQ 
AGGT6CACGC 
GGAACTTTGT 
QTGAGGGCAA 
GCCGCTTCCT 
GGQACATTGA 
GCCTOSCTQT 
AGCTCTGGGA 



CCTGCCTGTA CCCTCCCATA CTCACCAAAC CCTCITCCCX 180 

ACAGTTTGAG GCXXKCCCAA CCCCCCGCEG GTCGGGGCCa 240 

TCTGGCAGCA GAGCCTGGGC AGGTGACGGG CX3GGCGCX;GG 300 

GGAGGCTCCC AGGAACCGGA GCTGGAAACC CGGCCGAGGT 360 

AGAGTGACCC CTCGACX:TGT CAGCCATGGG GGAGATGGAG 420 

GCAGCTCAAG AAGCaGATTG CAGATGCX3W3 GAAAGCCTOT 480 

GCTGGTGTCT GGCCTAGAGG TGGTGGGACG AGTCCAGATG 540 

GGX3ACACCTG GCCAAGATTT AOGCCATGCA CTOGGCCACT 600 



TGXXTCGCAA 
CATCCCACTG 
GGCATGTGGG 
TGTCAAGGTC 
GGATGACAAC 
GACTGGGCAG 
GTCTCCTGAC 
TGTGCGACAG 



CGCTCCTCCr 
GGGCTGGACA 
AGCCX3GGAGC 
AATATTGTGA 
CAGAAGACTG 
TTCAATCTCT 



TC3AXC6TGT6 
OSGTCATGAC 
ACATGTGTTC 
TTTCTGCTCA 
CCAGCTCGGG 



CTGTGCCTAT 
C3VTCTACAAC 
CACAGGTTAT 
GGACACCACG 



'ATTTQTGGQ ACACAQQGGT 960 




GAGTCGGACA TCAACGCCAT CTGTTTCTTC CCCAATGGAG AGGCCftTCTG CACGGGCTCG 1140 

GATGACGCTT CCTGCCX3CTT GTTTGACCTG CXKSGCaVOACC AGGAGCTGAT CTGCTTCTCC 1200 

CACGAGAGCA TCATCTGCX3G CATCACGTCC GTGGCXrrTCT CCCTCAGTGG CCGCCTACTA 1260 

TTCGCTGGCT ACGACGACTT CAACTGCAAT GTCTGGGACT CCATGAAGTC TGAGCGTGTG 1320 

GGCATCCTCT CTGGCCACGA TAACAGGGTG AGCTGCCTGG GAOTCACAGC TQACGGGATG 1380 

GCTGTGGCXa CAGGTT(XTG GGACAGCTTC CTCAAAATCT GGAACTGAGG AGGCTGGAQA 1440 

AAGG6AAGTG GAAGGCAGTG AACACACTCA GCAGCCCCCT GCXXIGACCCC ATCTCATTCA 1500 

GGTGTTCrCT TCTATATTCC GGGTGCCATT CCCACTAAGC TTTCTCCTTT GAGGGCAGTG 1560 

GGQAGCATGG GACTGTGCCT TTGGGAGGCA GCATCAGGGA CACAGGGGCA AAGAACTGCC 1620 

CCRTCICCTC CCATGGCCTT CCCTCCCCAC AGTCCTCACA GCCTCTCCCT TAATGAGCAA 1680 

GQACAACCTG CCCCTCCCCA GCCXriTTGCA GGCCCAGCAfi ACTTGAOTCT OAGGCCCCftO 1740 

6CCCTAG6AT TCCTCCCCCA GAGCCACTAC CTTTGrOCAG QCCTGSGTGG TATAGGGOGT 1800 

TTGGCCCTGT GACTATGGCT CTGGCACCAC TAGGGTCCTG GCCCICTTCT TATTCATGCr 1860 

TTCTCCTTTT TCTACCTTTT TTTCTCTCCT AAGACAOCIG CAATAAAGTG TAGCa^CCCTG 1920 
GT 

Seq ID NO: 237 Protein sequence:
Protein Accession #: NP_002066

1 11 21 31 41 51

MGEMEQLRQB AEQLKKQIAD ARKACADVTL AEI.VSGI.BW GRVQMRTHRT LRGHLAKIYA 60 

MHWATDSKIO. VSASQDGKLl VKDSymiKV HAIPLRSSWV MTCAYAPS<W FVACXaSIiDllM 120 

CSIYKLKSRE GNVKVSREIoS AHTGYLSCCR FU3DNNIVTS SGDTTCALWD lETGQQKTVF 180
VGHTGDOISL AVSPDFNLFI SGACDASAKL WDVREGTCRQ TFTGHBSDIM AICPPPMGEA 
ICroSDDASC HIJDLRADQE LICFSHESII CX3ITSVAPSI. SGRLLFAGYD D"'™""^"' 
KSERVGIIiSG HDNHVSCLGV.TADGMAVATQ SWDSFIiKIMN 



1 11 21 31 41 51

TCCCAATGTG THGAACCTAC C^TAAATTCT TTTCTTAOJG GACAATCTTA TNCTAANCAA 60 

TACCATTTGC TTTTAAGGCA GATAATCCTC CAAGTrTTCT AATGATATCT GAAACTATTA 120 

ACTGATTCTG TGAATTATQA AATCIGAAAA GGAATTGGAA GTTGCTAAAA ATCTATCATT 180

TGCATTGACC AGTGTGAAGC ACAGTGGAAT QAOAATGCGT GCCCTGACaiC CAAAGAAAAA 240 

TAAtSTGACTG GAAAGCTGAA GRRTCRCCGG CrTCAGTGAC ATGGAAOXa GTGATTTGAT 300 

TTTTGACGAG TATCXSGGTGA CTTTOAGGTQ GTCRAGAAAC CACaCTTTAA GAACAATGTC 360 

CAAAAAGGGG AAAAAAAAGA GCAACCAAAG AAAAAAAATC CATAAAATTQ CACAGAAGAA 420 

AAGAAAGAAA AATAAAATAC ACAATATGGA OGATGGAGAA AAACAGTTAC ATTTCTTTAT 480 

GGATCRAQAA GTTTGTCTAC ACATAATCIC ATTTTGAGAT ATATAACTAT TTTTGTCTTT 540 

CAGAAGTGAA TCAAAATATT TCAAAATGCT GTCTTATGAA ACTACAATAT TCTCACAGAT 600 

TAGAAAAGTT TTTCTGTAAA AGTCAGATAG TAAATATTTT AGGTTTTGCA GTGTCTTTTG 660 

CAACTACTCA ACTTTCCTAC TGTAGCACAA GAGTAGCTGT GGTTICTGTGC AAATAAATTG 720 

CTTGTOrrCC AATAAAGCrr CATTEACAAA AACRTGCCAT QQQCCATATT TGQ^TOTAC 780 

ACTCTTOTTT GCCAAGTCCT AATATAGTTG CTTAGCAAGT ATTOTOAGCT ATTTQAGOAA 840 

GACATGAAAG TTCAXTGGGT TGCTAAAAAG TATOTAGAAA TTCAAAGGAA AATTAAAATT 900 

TAGGCTAAOT TATAATACaC TGTTTTAACA ATTOTAAAAT GTAAGAQAAA TTTACAAATA 960 
AAAATCCCAA ATAAAA 

Seq ID NO: 239 DNA sequence 

Nucleic Acid Accession #: NM_001786.1

Coding sequence: 130-1023 

1 11 21 31 41 51 

GGGGGGGGGG GGCACTTGGC TTCAAAGCTG GCTCTTGGAA AT1GAGCGGA GAGCGAOSCa 60 

6TTGTTGTAG CTGCCGCTGC GGCCGCCGCG GAATAATAAG CCGGGATCTA CCATACCCAT 120 

TGACTAACTA TGGAAGATTA TACCAAAATA GAGAAAATTG GAfiAAGGTAC CTATGGAGTT 180 

GTGTATAAGG GTAGACACAA AACTACAGGT CAAGTGGTAG CCATCftAAAA AATCWaCTA 240 

GAAAGTGAAG AGGAAGGGGT TCCTAGTACT GCAMTCGGQ ARATTTCTCr ATTAAAOGAA, 300 

CTTCGTCATC CAAATATAGT CAGTCTTCAG GATGTGCTTA TGCAG6ATTC CAGGTTATAT 360 

CTCaVTCITTG AGTTTCTTTC CATGGATCTG AAGAAATACT TGGATTCTAT CCCTCCTGGT 420 

CAGTACATGG ATTCTTCACT TCTTAAGAGT TATTTATACC AAA-TCCTACA GGGGATTGTG 480 

TTTTGTCACr CTAQAAGAGT TCTTCACaGA GACTTAAAAC CTCAAAATCT CTTQATrGAT 540 

GACAAAOGAA CAATTAAACT GGCTGATTTT GGCCTTGCCA GAGCTTTTGG AATACCTATC 600 

AGAGTATATA CACATGAGGT AGTAACACTC TGGTACAOAT CTCCASAAGT ATTGCTQGGG 660 

TCAGCTCGTT ACTCAACTCC AGTTGACATT TGGAGTATAG GCACCATATT TOCT OAACTA 720 

GCAACTAAGA AACCACTTTT CCATGGGGAT TCAGAAATTG ATCAAC TCTT CAGQATTTTC 780 

AQAGCTTTGG GCACTCCCAA TAATGAAGTG TGGCCRGAAfi TQGAATCTTT ACAGGACTAT 840 

AAGAATACAT TTCCCAAATG GAAACCAGQA AGCCTAGCAX CCCATGTCAA AAACTTGGAT 900 
QAAAAIGGCT TGGATTTGCT CTOGAAAATG TTAATCTATG ATCCAGCCAA ACGAATTTCT 
GGCAAAATGG CACTGAATCA TCCATATTTT AATGATTTGG ACAATCAGAT TAAGAAGATG 

TAGCTTTCTG ACAAAAAGTT TCCATATGTT ATGTCAACAG ATAGTTGTGT TTTTATTGTT 

AACTCTTGTC TATTTTTGTC TTATATATAT TTCTTTGTTA TCAAACTTCA GCTOrACTTC 1140 

GTCTTCTAAT TTCAAAAATA TAACTTAAAA ATGTAAATAT TCTATATGAA TTTAAATATA 1200 
ATTCTGTAAA TGTGAAAAAA AAAAAAAAAA AAAAA 



1020 
1080 



MEDYTKIEKI GBGTyGWYK GHHKTTGQW .AMKKIMiESE EEOVPSTAIR Bl! 
EMIVSIflDVI. »«DSRI.YLIP EFI.SMDI.KKy LDSIPPGQYM OSSLVKSYLY QILQGIVFCH 
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SRRVUQUILK POHLLIDDKa TIKLADFQLA RAFGIPISVY THEWTUiYS SPEVLUSSAS 
YSTPVDIWSI GTIPAEIATK KPI.FHGDSEI DQI.FRIFRAI. CTFNHBVHPB VESLODYKNT 
FPKHKFOSIA SHVKHLDENO lALLSKMLIY DPAKRISGKM AUIHPYFNDI. DNQIKKH 
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Seq ID NO: 241 DNA sequence 

Nucleic Acid Accession #: NM_033379.1

Coding sequence: 132-854 



GCTTTGCAQA 
ATTGACTAAC 
TTCTGTATAA 



GGQTAGACAC 



AACTTCGTCA 
ATCTCATCTT 
GTOVGTACAT 



TTGCTGAACT 
TCAGGATTTT 
TACAGGACTA 
AAAACTTG6A 
AACGAATTTC 
TTAAGAAQAT 
TTTTTATTGT 
AGCTGTACTT 
ATTTAAATAT 



TCCaVAATATA 
TGAGTTTCTT 
GGATTCTTCA 
GTCAGCTCGT 
AGCAACTAAG 
CAGAGCTTTG 
TAAGAATACA 



21 

■1 

TTTGTAGAGC 
CAGGGACTAT 
TATACCAAAA 
AAAACTACAG 
GTTCCTAOTA 
GTCAGTCTTC 
TCCATGGATC 
CTTGTTAAGG 
TACTCAACTC 
AAACCACTTT 
GGCACTCCCA 



GAGGGGCCa^. 



TGGCAAAATG 
QTAGCTTTCT 
TAACTCTTGT 
CGTCTTCTAA 
AATTCTGTAA 



TTGaATTTGC 
GCACTGAATC 
GACAAAAAGT 
CTATTTTTQT 
TTTCAAAAAT. 
ATGTGAAAAA 



TAGAGAAAAT 
GTCAAGTGGT 
CTGC3UiTTCG 
AGGATGTGCT 
TGAAGAAATA 
TAGTAACACT 
CAGTTGACAT 
TCKATGGGGA 
ATAATGAAGT 
GGAAACCAGG 
TCTCQAAAAT 
ATCCATATTT 
TTCCATATGT 
CTTATATATA 
ATAACTTAAA 



CTTGGCAGAG 
ACACGGGATC 
TGGAGAAGGT 
AGCCATGAAA 
GGAAATTTCT 
TATGCAGGAT 
CTTGGATTCT 
CTGGTACAGA 
TTGGAGTATA 
TTCAGAAATT 
GTGGCCAGAA 



51 
I 

CGCGCGGCCA 
TACCCATACC 
ACCTATGGAG 
AAAATCAGAC 
CTATTAAAGG 
TCCAGQTTAT 
ATCCCTCCTG 



AATGTAAATA 



GOCACCATAT 540 

GATCAACTCT 600 

GTGQAATCTT 660 

TCCCATGTCA 720 

GATCCAGCCA 780 

3 GACAATCAGA 840 

GATAGTTGTG 900 

ATCAAACTTC 960 

TTCTATATGA 1020



11 



21 



I I 
MEDYTKIBKI GEGTYGWYK ~ 
PMIVSLQDVI. KQDSRI.Y1.IF 
SARYSTPVDI WSIGTIFAEL 
KNTFPKHKPG SLASKVKMLD 



31 



41 



SI 



I I I 

AMKXIRLBSE EEGVPSTAIR EISIiLKEIjRK 
LDSIPPGQYM DSSI>VRVVTL WYRSPEVUiG 
SBIDQLFRIF RALGTPimEV HPEVESLQDY 
LIYDPAKRIS G 



Seq ID NO: 243 DNA sequence
Nucleic Acid Accession #: AF101051.1
Coding sequence: 221-856 



ACCTGCCACC 
GCTGTTGGGC 
GCCCCAGTGG 
CGAGGGGCTG 



GCCACCTTCG 
CCTGAGCCAG 
TTCATTCTCG 
AGGATTTACT 
TGGATGTCCT 
CTGAATCTGA 
GGAOTGATAG 



OGCGGGCGCC 
CCTTCCTGGG 
CCTATGCCGG 
GCGTGTCGCA 



CGAGCGAGTC 
ATGGATCGGC 
CX3ACAACATC 
GAGCACCGGG 
GCAAGCAACC 
GGCCACCGTT 



GCAAACTCTC 
ATGGCCAACG 
GCCATCGTCA 
GTGACOGCCC 
CAGATCCAGT 
CGTGCXrrTGA 
GGCATGAAGT 



ATTCTATGAC 



GAAAQACTAC 
GQACATTGAG 
GTATGGTATT 
AAACATGGCT 



CTOGCTATTT 
CCTATGAOCC 
GCTGCTTCTC 
ACCTCTTACC 
GTGTGACACA 
ATACTATCAT 



aoaagatqag oatggctgtc attgggggtq 
agcatgotat 
caggtacgaa 
gggaggtgcc 
gccx:tatcca 



TTTGGTCAGG 
CTACTTTGCT 
AAACCTGCAC 
TGTTGAAACA 
TTTGGGTATT 
GTGTTAAAAT 

TAATCTTATT TTATCTTCTT TCCTCAATAT AGGAGGGAAG 
GGGAAGGGGT 



CAGTCAATGC 
TCTGCCTTCT 
CAACACCAAG 
GAGQCAAAAG 
TAACATTAGG 
CAAACAAACA 



AACTTCCTCC 
CGCCTTCTGC 
CGGGGCTGCA 
GCACTGCCCT 
AGGCCATGTA 
GCAAAGTCTT 
TGGTGGTTGG 
GTATGAAGTG 
CGATATTTCT 
TOGTTCAAQA 
CTCTCTTCAC 
GTTCCTGTCC 
CTTCCAGCGG 
AACCGAAAAT 
GTAATCTGAA 
ACTCAGTGCT 



CTCATTATGT 
CCATATTGAT 
CAGTCAAATA 
CTAATTTACC 
TTATTTTTTA C 
TTTCATTGGT C 



TATOTATATA 
TGATACTAGC 
GAAGATGTTT 
TCATTTACTC 



6TGAXAAATT 



ACCTTTTTGT 
TATATCTTCC 
GATAATCTGG 
TCTTTTTTCT 
AATATTAATT 
TTTATTTGCT 
CrrCATGTGA 



AAACCTAC3GC 
ATTCITTCAG 
TTTCCAGTCT 



CCTGTTGAiCC 
AAATATTTOT 
TTGATTGAAT 
TCCOCATTCC 
TAATAAGGTG 
TGACAAATAT 
ATCTGCCAAA 
agtttatatt 
CAOCTGOCTG 
TTCACTGCCT 
TCATQTGGTT 
ACATACCTTC 
CTGTGTCTGA 



GAOTAATCAT 
TACATGTTTT 
ATACTTAAAA 
ATTGGTATAT 
TTCTTCATTA 
TCTTTCAATT 
ATAGCACTTG 
TGAATCTAAC 
AAATCAGAAC 
TTCCCACACA 
CCAATTGAGT 
TTTTAAGCTA 
TTAATTGTAT 

TCTCTCrOTA 
TTGAGATAAT 
ACTCTCATTC 
AGACACTGAA 
TCCTCTCTCT 



ACTCAAATGG 
TCTATTAAAA 
TATCTCTAAA 



ATGTGGCTCA 
CA3GTTTGTG 
CTATTTCRCT 



GCTTTGGGTG 
CTTCATGaST 
CATCOTTATT 
ACATTTCATA 
TTTGGAGGCA 
ATCCCTGTAC 
AGCTGCATGC 
CTTATTCATA 
TGTTTTCCCA 
GTCTOAACAA 
GCXGTAftGCA 
GATACTTAAC 
TTTGAACATG 
GAAGTCACTO 
ACCAGTCTAT 
CTCTCTCTAC 
GTGCCTTCCT 
CTCTGTTOCA 



ATAGGTAAAT 
GTCCTTATAT 
CCTTTGCCAC 
GCCCTTTTCA 
AAGCCCTTAT 
GCCTACATTT 
AATCTTTCTG 
TCTOACCCAT 
TGTTCCCCCA 
GTTTTATATC 
AGTGTAATTA 



AACTATGCCT 
AACAAAACCT 
TTCCACTGAA 
CAGTCTATTT 
CrCTCTACCA 
TTTTAACAAC 



GCTCCTTAAA 1140 



1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920
1980 
2040 
2100 
2160 
2220 



GTATTTAATT 
ACATATOIAA 
AAGACCTAGC 
TATACTTATT 
TTQTTTT6TO 
•rAGTTTCTAA 
CATGACX»AA 
AGCACTCTTG 



CCCCTAAACT 
TCATGCQTTT 
TTTCTGGAGT 
TCTTTCTACC 
AGGTAGTGX6 
ATQTAGTGTC 
ACACACGTAC 
CAAAACCTAC 
CCACTGAACA 
GTCTATTTCC 
TGCTCTTACT 
AAGGGTGTTa 



2340 
2400 
2460 
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GCACTGGTQT CTGGAGACCT QQATTTGAGT CTTGOTGCTA TCAATCRCCG TCTGTGTTTG 2520 

AQCAAiGGCAT TTGGCTGCTG TAAOCTTATT GCTTCATCTO TAAQOQSTGG TTTGTAATTC 2580 

CTQATCrrOC CaCXrrCACflG TGATCTTGTO GGGATCCAGT GAGATAGAAT ACATGTAAGT 2640 

GTGGTTTTGT AATTTGAAAA QTGCTATACT AAGGGAAAGA ATTQAGGAAT TAACTGCATA 2700 

CGTTTTGGTG TTGCTTTTCA AATGTTTGAA AATAAAAAAA TGTTAAGAAA TGGGTTTCTT 2760 

GCCTTAACCA GTCTCTCAAG TGATGAGACA GTGAAGTAAA ATTSAGTGCA CTAAACX^T 2820 

AAGATTCTGA GGAAGTCTTA TCTTCTGCAG TGAGTATGGC CCAATGCTTT CTGTGGCTAA 2880 

ACAGATGTAA TGGGAAGAAA TAAAAQCCTA CGTGTreOTA AATCCAACAG CAAGGGAGAT 2940 

TTTTGAATCA TAATAACTCA TAAGGTGCTA TCTQTTCAGT GATGCCCTCA GAGCTCTTGC 3000 

TGTTAGCXGG C3\GCTGACGC TQCTAGGATA GTTAGTTTGG AAATGGTACT TCATAATAAA 3060 

CTACACAAGG AAAOTCAGCC ACOGTGTCTT ATGAGGAATT GGACCTAATA AATTTTAGTG 3120 

TCCCTTCCAA ACCTGAGAAT ATATGCTTTT GGAAGTTAAA ATTTAAATGG CTTTTGCCAC 3180 

ATAC&TAQAT CTTCATGATG TGTOAGTGTA ATTCXaTGTO GATATCAGTT ACCAAACATT 3240 

ACAAAAAAAT TTTATGOCCC AAAATOACCA ACGAAATTGT TACAATAGAA TTTATCX^AT 3300 

TTTGATCTTT TTATATTCTT CTACCACACX: TCQAAACAOA CCAATAGACA TTTTGOGCTT 3360 

TTATAATGGG AATTTGTATA AAGCATTACT CTTTTTCRAT AAATTGTTTT TTAATTTAAA 3420 
AAAABOAAAA AAAAAAAAAA AAA 



Seq ID NO: 244 Protein sequence:
Protein Accession #: AA016433.1

1 11 21 31 41 51

I I I I I I 

MANAGLQIjLG filaflgmxg AIVSTALPQW RIYSYAGDMI VTAQAMYEGL HMSCVSQSTQ 60 

QIQCKVFDSIi mLSSTLQAT RAUWVGILIi GVIAIPVATV GMKCMKCLED DGVQKMRMAV 120 

IGGAIFLIAfl LAILVATAWY GNHIVQEFYD PMTPVNAHYE FGQALFTGWA AASLCLLGOA 180 
LLicCSCPRRir TSYPTPRPYP KPAPSSGKDY V 

Seq ID NO: 245 DNA sequence

Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51 

TTTTTTTTTT ' ITiTriT ' r T T TTTTTCAAGG AGAGCACAAG GAACTTTATT AATGACTTTC 60 

TTAATGGTTA AATGCTGTTT ACCAAQTOAC CXAQAGaCAG CGTGGTTTAG TGGTTTCAAC 120 

AGCATG6TCC CGAGAGTCTG ACAAACCTCA GTTCAAATCC TTCTTTTGTC TTCACTTAGT 180 

TTTTCTTCCT GAGATTTAGT TTCTTCATCG TTAACauVTGA GGATATTAAT ATGTTTCACA 240 

CAGTTGTTAT GAAGAATGCA TATATTAGAA TGCCTGTAGT CTCAGCTACT CAGGAGGCTA 300

AGGTGGGGAG GTCGCTCAAG CCCAGGAATT CAAAGCTGCA ATGCATTATG ATTACAGCTG 360 

TTAATAGCCA CTGCACTTCA GCCTGGGCAA TQTAGTAAGA TCCCATCTCT GGCTOjGAGG 420 
GTCCTACGCC CAOGGAGTCT CGCTGATTOC TAGCACAGCA GTCTOAGATC AAACTQCA 

Seq ID NO: 246 DNA sequence

Nucleic Acid Accession #: XM_058553.2

Coding sequence: 897-1400 

1 11 21 31 41 51 

1 I I I I I 

AATTTTCAGA AGTTTCGTAT GGGGATGGTT TTATATAAAT TCAGGTTTTT CCCACAATAA 60 

TAAATGTATT TAOTCTCROT GCTCAATAGA AQAOATTTCT AATAGAAAAG GATTCAAACT 120 

GT6AAACCAT TTCTCTTTTA ATSTTTCACA TTCCTOTTAC AQATTTGTTC TCTTGTGACT 180

CTCTTATCCA TAATATGGAC A«5TTCTTGAG TCCTAACMT aAaAGGTTTT CCCTTAQTGC 240 

ATAGAGGGAA TGAGTATTAA TTGGAGAAGC TTAAAGTATT GCCACTTTAG CRCTGAAGAT 300 

TGGGATGAGA GOAGGTaAAA CCTCACTAGA AAAAGGGACA ATGTTAGTGT GGCCCTTCCT 360 

GATCATGTTT AAGAAAAGTC ATGAAAATGG TGAACTAGTG TTTCCAAGCA TATTGGAAGQ 420 

GTTGAGTGTA TACTGTCTGT CAAAGACTTC CAGCATTTCC AGGTCCTAGA GAGGAACAAG 480 

ACTGGTAACC TGCCTATCTG TATTTTTAAG AACCCAGGAG GAAAGCTTTA TAATAGAACA 540

TTATTTCTGT GTTTATGTAT AAGGGGTTTT TTGTTTTTTT AAAGACAGGA TCTCACTCCA 600 

TTCTCCAGGC CAAGTGCAAT GGCACGAACC TCATAGCTCC TGOACTTAAG TGATCTGCCT 660 

GCCTTTGCCT CCTQRGTAGC TGGGACTACA GGCATGAGCC CCCATGCCTG GCTAAGTTTa 720 

rrrri ' rmri ' tgtttotttg •r r i-G-i T fTTG gggggggttg ' rrn - Grri ' J ' r tgtagagagg 7ao 

TAGTCTTQCT TTOrTGCCAQ GCTAGTCTCA AACTCCTGGC TTCAAGTGAT CCTCCTGCCT 840 

CAGCCTCCCA GAGTQCTAGG ATTACAGCAC TTGGATTCAG CTTCTTCATT TCCAACATGG 900 

AAGAAACTTA CACCGACTCC CTGGACCCTG AGAAGCTATT GCAATGCCCC TATQACAAAA 960 

ACCATCAAAT CAGGGCTTGC AGGTTTCCTT ATCATCTTAT CAAGTGCAQA AAQAATCATC 1020 

CTGATGTTCC AAGCAAATTG GCTACTTGTC CCTTCAATGC TOGCCACCAG OTTCCTCX5AO 1080 

CTGAAATTAG TCATCATATC TCAAGCTGTG ATGACAGAM3 ITGTATTGAG CRAGATGTTG 1140 

TCAACCAAAC CAQGAGCCTT AGACAAGAGA CTCTGGCTGA GAGCACTTGG CAGTGCCCTC 1200 

CTTGCQATGA AOACTGGGAT AAAGATTTGT GGGAGCAGAC CAGCACCCCA TTTGTCTGGG 1260 

GCACAACrCA CTACTCIGAC AACAACAGCC CTGCGAGCAA CATAGTTACft GAACATAAQA 1320 

ATAACCTGGC TTCAGGCATG CQAGTTCCCA AATCTCTGCC GTATGTTCTG CCATGGAAAA 1380 

ACAATGGAAA TGCACAGTAA CTGAATACCT ATCTCATCAA ATGCCRGACC CTAGAAQACT 1440 

GTTQCTTCTT CTTCTACCAG TGGGTTCTCA TTTTC CTCCT AATCTAATTA TAGAATGGTA 1500 

AACTCCCTST GACTTTCCAA ACTGACAAGC AOVCTTTTTT CCTCCCCCCT TOAATCCTCA 1560 
TTTAATGCAA GAACCCTCAT ACTCAGAAGC TTCCAAATAA ACCTTTGATA CAGATTG 



Seq ID NO: 247 Protein sequence:
Protein Accession #: XP_058553.1

1 11 21 31 41 51 

I I I III 

MEETYTOSLD PEKLLQCPYD KNHQIRACRF PYBLIKCBKH HPDVASKLAT CPFHARHOVP 60 
HAEISHHISS CDDRSCIEQD WNQTRSUtQ BILAESTWQC PPCDBDNDXD IMGQTSTPFV 120 
HGTTBYSDiai SPASNIVTEH KKmASGttBV PKSLPYVLPH IQIKGNAQ 
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1 11 21 31 41 51 

TTAAGGAAAT COSGGCTGCT CTTCCXICATC TGGAAGTGGC TTTCCCCACA TCGGCTCGTA 
AACTGRTTAT GAAACATACG ATQTTAATTC GQAQCTGCAT TTCCCAGCTG GGCACTCTCG 
CGCGCTGGTC CCXXKXXXXT CXSCCXXXXAC CCCXTTGCCCT TCCCTCCCGC GTCCTGCCCC 
CATCCTCCAC CCCXXS30GCT GOCCACCCCXJ CCTCCTTCGC AGCCTCTGGC GGCAGCGCGC 
TCCACTCGCC TCCCGTGCTC CTCTCQCCCA TG6AATTAAT TCTG6CTCCA CTTGTTGCTC 
□GCCCAQGIT GGGGAQAGGA CGGAGGGTGG CCGCAG0C3GG TTCCTGRGTG AATTACCXUW3 

GAaaaACTOA gcacswscacc aacttagagag gggtcagggg gtgcggoact cgagcgagca 

GGAAGGAGGC AGCGCCTGGC ACCAGGGCTT TGACTCAACA GAATTGAGAC ACGTTTGTAA 
TCGCTGGCGT GCCCCGCGCA CAGGATCCCA GCGAAAATCA GATTTCCTGG TGAGGTTGCG 
TGGGTGQATT AATTTGGAAA AAGAAACTGC CTATATCTTG CCATCAAAAA ACTCACGGAG 
GAGAAGCGCA GTCAATCAAC AGTAAACTTA AGAGACCCCC GATGCTCCCC TGGTTTAACI 
TGTATGCTTG AAAATTATCT GAGAGGGAAT AAACaTCTTT TCCTTCTTCC CTCTCCAGAA 
QTCCRTTGGA ATATTAAGCC CAGOAGTTGC TTTGQGGATG GCTGGAAGTQ CAATGTCTTC 
CAAGTTCTTC CTAGTGQCTT TGGCCATATT TTTCTCCTrC GCCCAGGTTG TAATTGAAGC 
CAATTCTTGG TGGTCGCTAG GTATGAATAA CCCTGTTCAG ATGTCAGAAQ TATATATTAT 
AOGAGCACAG CXTCTCTGCA GCCaiACTGGC AGGACTTTCT Ca^AGGACAGA AGAAACTGTG 
CCaCTTGTAT CAJ3GACCACA TGCAGTACAT OSGAGAAGGC GCGAAGACAG GCATCAAAGA 
ATGCCAGTAT CAATTCCGAC ATCQACGGTG GAACTGCAGC ACTGTGGATA ACACCTCTST 
TTTTGGCAGG GTGATGCAGA TAGGCAGCCG CX3AGACQGCC TTCACATACG CCGTGAGCaC 
AGCAOGGGTa GTGAACGCXa TGAQCCGGGC GTGCCGCGAG GGOGAGCTGT CCACCTGOGQ 
CTOCAOCCOC QCCQCSSOQCX: CCAAOGACCT GCCGOGGGAC TGGCTCTGGG GCGGCTGOGG 
CX3ACAACATC SACTATGGCT ACCXSCTTTGC CAAGGAGTTC GTGGACGCCC GCX5AGCGGGA 
QCGCA'reCAC GCCAAGGGCT CCTACGAOAa TGCTCGCATC CTCATGAACC TGCACAACAA 
CQAGGCOGGC CGCAGGACGG TQTACAACCT GGCTGATGTG QCCTGCAAGT GCCATGGGGT 
GTCCGGCTCA TGTAGCCTGA AGACATGCTO GCTGCAGCTG GCAQACTTCC GCaUUSGTOGG 
TGATGCCCTG AAGGAGAAGT ACGAC3U3CGC GGCGGCCATG CGGCTCAACA GCCOGGGCAA 
GTTGGTACAG GTCAACAGCC GCTTCAACTC GCCCACCavCA CAAGACCTGG TCTACATC3GR 
CCCXaVGCCCT GACTACTGCG TGCGCAATGA GAGCACCGQC TCGCIGGGCA CGCAGQOCCG 
CCTOTGCAAC AAQAGGTCGG AGGGCATGGA TGGCTGCGAG CTCATGTGCT GOSGCCGTGG 
GTACSACCAG TTCAAOACCQ TGCAGACGGA GOGCTGCCAC TGCAAGTTCC ACTGGTGCTG 
CTACQTCAAa TGCAAGRAGT GCAOSGAGAT C6TGGACCAG TTTGTGTGCA AGTAGTGGGT 
GCCACCCAGC ACTCABCCCC GCTCCCAGGA CCCGCTTATT TATAGAAAGT ACAGTGATTC 
TGGTTrnGG TTTTTAGAAA TATTTTTTAT TTTTCCCCSUV GAATTGCAAC CGGAACX»TT 
TTTTTTCCTG TTACXATCTA AGAACTCTGT GGTTTATTAT TAATAJ TATA ATTATTATTT 
GGCAATAATG GGGGTGGGAA CCACGAAAAA TATTTATTTT GTGGATCTTT GAAAAGGTAA 
TACAAGACTT CTTTTGGATA GTATAGAATO AAGGGQGAAA TAACAC ATAC CCTAACTTAG 
CTGTGTGGGA CATGGTACAC ATCCAGAAflG TAAA6AAATA C3WTTTCTTT TTCTCAAATA 
TGCCATCATA TGGGATGGGT AGGTTCCAGT TGAAAGAGGG TGGTAOAAAT CTATTC31CAA 
TTCAGCTTCT ATGACC3VAAA TGAGTTGTAA ATTCTCTGGT GCAAGATAAA AGGTCTTGGG 
AAAAC3VAAAC AAAACAAAAC AAACCTCCCT TCCCCAGCAG GGCTGCTAGC TTGCTTTCTG 
CATTTTCAAA A1GATAATTT ACAATGGAAG GACAAGAATG TCATATTCTC AAGGAAAAAA 
GGTATATCAC ATGTCTCATT CTCCTCAAAT ATTCCATTTG CAGACAGACC QTCATATTCT 
AATAGCTCAT GAAATTTGGG CaGCAGGGAG GAAAGTCXXC AGAAATTAAA AAATTTAAAA 
CTCTTATGTC AAGATGTTGA TrTGAAQCTG TTATAAGAAT TGGGATTCCA GATTTGTAAA 
AAGACCCCCA ATQATTCTGG ACaCTAGATT ■ I T i ' I Gl i 'l GG GGAaOTTGGC TTGAACATAA 
AT6AAATATC CTGTATTTTC TTAGGGATAC TrGGXTAGTA AATTATAATA GTAGAAATAA 
TACATGAATC CCATTCACAG GTTTCTCaGC CCAAGCAACA AGGTAATTGC GTGCCATTCA 
GCACTGCACC AGAGCAGACA ACCTATTTGA GGAAAAAC3VG TGAAATCX3VC CTTCCTCTTC 
ACACTGAGCC CTCTCTGATT CCTCCX3TGTT GTGATGTGAT GCTGGCCACG TTTCCAAACH 
GCAGCTCCAC TGCGTCCXrCT TTGGTTGTAQ GACAGGAAAT GAAACATTAQ GAGCTCTGCT 
TOGAAAACAG TTCACTACTT AGGGATTTTT GTTTCCTAAA ACTTTTATTT TGAGGAGCAG 
TAGTTTTCTA TOTTTTAATO ACAGAACTT6 GCTAATGGAA TTCACAGAGG TGTTGCAGCQ 
TATCACTOTT ATOATCCTGr OTTTAGATTA TCCACTCATG CTTCTCCTAT TGTACTGCAG 
GTGTJVCCriA AAftCIGTTCC CAGIOTACTT GAACAQTTOC ATTTATAACG GGGQAAATOT 
GGTTTAATGG TOCCTGRTAT CTCAAAGTCT TTTGTACATA ACATATATAT ATATATACAT 
ATATATAAAT ATAAATATAA ATATATCTCA TTGCAGCCAG TGATTTAGAT TTACAGCTTA 
CTCTGGGCTT ATCTCTCTGT CTAGAGCATT GTTGTCKTTC ACTGCAGTCC AGTTGGGATT 
ATTCCAAAAG TTTTTTGAGT CTTGAGCTTG GGCTGTGGCC CCGCTGTGAT CATACCCTGA 
GCACGACS3AA GCSkACCTCGT TTCTGAGGAA GAAGCTTGAG TTCTGACTCA CTGAAATGOS 
TGTTGGGTTG AAGATATCTT TTTTTCTTTT CTGCCTCACC CCTTIGTCIC CAACCTCCAT 
TTCTGTTCAC TTTGTGGAGA GGGCATTACT TGTTCGTTAT AGACATGGAC GTTAAGAGAT 
ATtCAAAACT CAOAAQCATC AQCAAT6TTT CTCTTTTCTT AGTTCATTCT GCAGAATGGA 
AACCCAIGCC TATTAGAAAT GACAOTACIT ATTAATTGAG TCCCTAAGGA A TATTC AGCC 
Ca«:PAC»TAG ATAGCTTTTT ■ wnrnT T TrrTTTTTAA TAAGGACACC TCTTTCCAAA 
CAGGCXaTCA AATATGTTCT TATCTCAGAC TTAOGTTGTT TTAAAAGTTT GGAAAGATAC 
ACATCTTTTC ATACCCCCCC TTAGGAOaTT GGGCTTTCAT ATCACCTCSG CCaACTGTGG 
CTCTTAATTT ATTGC3VTAAT GATATCCACa TCAGCCAACT GTGGCTCTTT AATTTAXTGC 
ATAAT6ATAT TCACATCCCC TCAGTTGCAG TGAATTGTGA GCAAAAGATC TTGAAAGCAA 
AAAGCACTAA TTAGTTTAAA ATGTCACTTT TTTGGmTT ATTATACAAA AACCATGAAG 
TTVCTTTTTTT ATTTGCTAAA TCAGATTGTT CCTTTTTAGT GACTCATGTT TATGAASAGA 
GTTGAGnTA ACAATCCTAG CTTTTAAAAG AAACTATTTA ATGT AAAATA TTCTACATaT 
CATTCAGATA TTATGTATAT CTTCTAGCCT TTATTCIGXA CTTTTAATGT ACATATTTCT 
GTCTXGCGTG ATTTGTATAI TTCACT GGTT TAAAAAACAA ACATCQAAAG GCTTATTCCA 
AATG6AAGAT AGAATATAAA ATAAAAC3GTT ACTTGTAAAA AAAAAAAA 



1140 
1200 
1260
1320 

1440 
1500 
1560 
1620 
1680
1740 
1800 
1860 
1920 

2040 
2100 
2160 
2220 
2280 
2340 
2400 



2700 
2760 
2820 

2940 
3000 
3060 
3120 



3420 
3480 
3540 
3600 



3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 



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10 
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MAGSAMSSKP FLVALAIFFS FAQWIEANS WWSLGMNNPV QMSEVYIIGA QPLCSQLAGI. 
SQGQKKLCBL YQDHKQYIGE GAKTGIKECQ YQFRHRRHNC STVDNTSVFG RVMQIGSRBT 
APTYAVSAAC WNAMSRACR EGELSTCGCS RAARPKDLPR DWLWGGCGDN IDYGYRFAKE 
FVDARERERI HAKGSYESAR ILMNLHNNBA GRRTVYNliAD VACKCHGVSG SCSLKTCMW} 
LADFRKVGDA LKEKYDSAAA MRDNSRGKIjV QVNSRFNSPT TQDI.VYIDPS PDYCVRNEST 
GSLGToaRLC NKTSEGMDGC BLMCCGRGYD QFKTVQTERC HCKFHWCXTV KCKKCTEIVD 
QFVCK 

Seq ID NO: 250 DNA sequence
Nucleic Acid Accession #: NM_014058
Coding sequence: 56..1324






TGACTTCQAT 
TOOGCOVOAT 
CGTCATCTTC 
GAGATATAAT 
ACTATATGCT 
TGAATCAATG 
TCAGGTTATC 
TAGATTTCAC 
TGAAAAGCTG 



GTAGACCTOG / 



TAAAACTCTA 
GCCCTGGCAO 
TGCCACATGG 
GACTGCTTCX: 
AATTGTCCAT 
TTCTAGOCCT 
TGAGTTTCAA 



TOAACCTCAA 



ATATCCCTGA T 

GRGTTTGGCA 
GTGAAAAATG 
AAGTTCAGTC 
TCTACTGAGS 
CAAGATGCTQ 
AAGACAGAAA 
GGTCAGAGTC 
GCTAGCCTGC 
CTTGTGAGTG 
TTTGGASTAA 
GAAAAATACA 
GTTCCCTACA 
CCAGGTGATG 
AATCATCTTC 
QCTTACAATG 
QATGCATGCC 



CCTACAATTA 
GAGAGGCTTC 
CATTTTATAA 
AACAGAAGCA 
ATCCTGAAAC 
TAGGACCCCC 
CAQACAGCTA 
TCAGGATCGT 
AGTGGGATGG 
CTGCTCACTG 
CAATAAAACC 
AACACCCATC 
CAAATGCAGT 
TGATGTTTGT 
GACAAGCACA 
ACGCCA7AAC 
AGGGTGACTC 

AGATATCTGG TACCTTGCTG GAATAGTGAG 



31 
I 

GACTCTTCAT 
AGTTTGTTGG 
AGTGTGCATT 
CTATAGCACA 
TAACAATTTT 
ATCTCCaXTA 
TGGAGTQTTG 
TGTAGATAAA 
TAAAGTAGAT 



I 



GAACCCTGG6 
GGACTCACTG 
TTGTCATTTA 
ACAGAAATGA 
AGGGAAGAAT 
GCTCATATGC 



CAATGATGTA 
TTATGGGCCT 
TTCATTATGT 
CAACTGACAA 



TTGTCAAGTC 
TGTTGATTTG 
TTGTTTTACA 



TGGTGGGACA 
GAGTCATCGC 
TTTTACAACR 
TTCGAAAATG 
ACATGACTAT 
ACATAGAGTT 
GACAGGATTT 
GGTGACTCTC 
TCCTAGAATG 



a CACGAAOAAG 600 



CCTTAATTAA 
CTGCCACATQ 
TCCGGAGAAT 



TGTCTCCCTG 
GGAGCACTGA 
ATAGACGCTA 
TTATGTGCTO 



ATGCATCCTA 
AAAATGATGG 
CAACTTGCaUV 
QCTCCTTAGA 



AATAAACTGT 



AAAGCCTCAT 
AGATACAGAA 
TTGCTT6ATG 



CIGGOSAGKr GAATQTOCaA AACCCAACAA 
GCGGOACTGa AT TACTTCAA AAACTGGTAT 
ACATTTTTTT TTCTTTTTTS GOTGTGGAGG 
CTTGCAAAAC AOCTAGATTT GACTGATCTC 



1200 
1260 
1320 
1380 
1440 






11 



31 



51 



I I I I I I 

MYRPDWRAR KRVCWEPWVI GLVIFISIjIV LAVCIGLTVH YVRYNQKKTY NYYSTI^FTT 
DKLYAEFGRE ASKNPTEMSQ RLESMVKNAP YKSPLREEFV KSQVIKFSQQ KHOVLAHMIil. 
ICRFHSTEDP ETVDKIVQIiV LHEKLQDAVG PPKVDPHSVK IKKIHlCrETD SYUWCCXJTR 
RSKTLGQSIiR JVGGTEVEEG EWPHQASIflW DGSHRCX3ATL INATWLVSAA HCPTTYKKPA 
RWTASFGVTI KPSKMKRGIiR RIIVHEKYKH PSHDYDISIaA ELSSPVPYTM AVHRVCLPDA 
SYEFQPGDVH FVTGFGALKN OOySQKDII>RQ AQVTUDATT OIEPQAYKDA nVKHUMSS 
LEGKIDACQ6 DSGGPLVSSD ASOIHYIAOI VSHODECAKP NT" ' 

or 

Seq ID NO: 252 DNA sequence 
Nucleic Acid Accession #: NM_003504.2
Coding sequence: 71-1771



r AIiRDWITSKT 420 



31 



41 



51 



GGCACGAGGC 



GAGCGTCCTT 



CTCGTGCCGC 
ATGTTCGTGT 
CTCTTCXJTGG 



GCATTTCTTG 
GTAGACCTAT 
CATAGGCCAG 
CAAGATGATG 
GAAGAGCATT 



CGGGCTCTTG 
CCGATTTCaS 
CCTCGGACGT 
AOSTGCAATA 



I I I 

GTACCTCAGC GCGAOCGCCA GGCGTCCXMC 
CAAAGAGTTC TACGAGGTGG TCCaGAGCCA 
GGATGCTCTG TGTGCGTGGA AGATCCTTCA 
TACGCIGGTT CC AGTTT CTO GGTQGCAAGA 
TATTTTATTC TCATAAACTB 



TGGATATTCT TCAACCTGAT GAAOACACTA TATTCTTTOI 360 



CCTGQTOCTC 
AGCC3VGGTTC 
CATGGGTCrr 
GGAGAATTTG 
CGTGCAGACT 
CTTTGCCACC 
CATCCAGGCT 
ACTOGCCAAO 
CCTCSTCATC 



AACACACTCT 
TACCAGCACT 
AAGCTGTGGT 



ACCTTQAAGT 
CAGGAAATGA 
TAGTGGAGCA 
TCCTCTTTGA 
AGCTGGCTTG 
TAACAQACCA 



CAATGTATAC 
TCCCGCCTAT 
CAaTQATGGG 
AACCATGCGG 
CTACXSAGCAQ 
GATGCTGTCX: 
GTG GGTG CAA 
CCAOQTTTCC 



AACGATACCC 
GAAGACATCT 
TCAGAGCCTT 
AGGAGGCAGC 
TATGAATATC 
AAGGACCTGA 



OQCCACAACC 



TTGAAQAGTC 
ATTTTGGGTT 
T66AGAGCCC 
CTG6ACAGCC TCTCCAGOAQ TAACCTGGAC A 



ACA6AAGCG0 
GAAGTTCCAG 
TGCAAATAAA 
CAAGCACAAG 



TGt3UM»CCA 
CTCX3«5QAGT 
GCCATGGACA 
TTTGGGATGA 
TTTCTGGCCA 



CTGAGAAGCG 
GGCGAGAGTG 
ATGGGACATC 
ATOACATGCT 
CTCAAjyiGAA. 
ACCBQAAOQA 
ATGACCTCCO 



S ACAAAQAACC C 



AGATCAAATT 420 



1020 
1080
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 



TCCTTGCAGA 
TCTCCTTGAA 
AGGACATGCG 
GCGACGTGGT 
CAGATCACTT 



TTTOCACCAA 
CTCCAGATGT 
TCAAGTCCTT 




GAGCATGGAG CaVTGGCACAG TGACGSTGGT GGGCATCCCC CCAGAGACCG ACAGCTCGQA 1620 

CAGGAAGAAC TTTTTTGOGA GGQC6TTTGA GAAGGCAGCG GAAAGCRCCA GCTCCOGGAT 1680 

GCTGCACAAC CATTTTGAOC TCTCKSTAAT TQAGCIQhAA GCTGAG8ATC QI3AGCAAOTT 1740 

TCTGGACGCA CTTATTTCCC TCCTGTCCTA GGAATTTOAT TCTTCCAGAA TGACCTTCTT 1800

ArrTATGTRA CTQGCTTTCA TTTAGATTGT AAGITATGGA CATGRTTTCA GATeTAGAAO 1860

CCATrrTTTA TTAAATAAAA TGCTTATTTT AGOCTCOGTC CCCAAAAAAA AAAAAAAAAA 1920 
AAAAAAAAAA AA 



I I I I I I 

MFVSDFRKEP YEWQSQRVI. I.FVASDVDAL CACKILQALP QCDHVQYTLV PVSGWQEIiET 
AFIiEHKEQFH YFILINCX3AN VDLLDILOPD EDTIPFVCDT HRPVNWNVY NDTQIKLLIK 
ODDDIiEVSAY EDIPRDBEED EEHSGNDSDG SEPSEKRTRL EEEIVBQTMR RRQBREWEAR 
RRDILFDYEQ YEYHGTSSAM VMFEIiAWMIiS KDUn3MI.WWA IVGLTOQWVQ DKITQMKYVT 
DVOVLQBHVS SHNHRNEDBE NTLSVDCTRI SFEYDLRLVL YQHWSLHDSI. CNTSYTAARF 
KLWSVHGQKR LQEFIASHSIi FXiKQfVKOKFQ AMDISUCENI. REHIEESANK FaMKDMRVQT 
FSIBFOFKHK FLA5DWFAT MSLMESPEXD GS6TDRFIQA UJSLSRSKU) KLYBGLBLAK 
KQLRATQQTI ASCLCTNLVI SQGFFLYCSIi MBQTPDVMIiF SRPASLSIiLS KHIiLKSFVCS 
TKMRItCKIiI>P IiVMAAPbSMB HQTVTWaiP PETDSSDRKH FFGRAFEKAA ESTSSBHUDI 
KFDLSVIELK AEDRSKFLDA LISLLS 

Seq ID NO: 254 DNA sequence
Nucleic Acid Accession #: NM_022337
Coding sequence: 48..683



1 11 21 31 41 51 

GGCTGCGCTT CCCTGGTCAG GCACGGCAOG TCTGGCCGGC CGCCAGGATG CAGGCXCOGC 60 

ACAAGGAGCA CCTGTACAAG TTGCTGaTGA TTGGCOACCT GGGCGTGGGG AAGACCAGTA 120 

TCATCRAGCG CTACGTGCAC CAGAACTTCT CCTCX3CACTA CCQGGCCACA ATCGaOSTOQ 180 

ACTTCGCGCT CAAGGTGCTC CACTGGQACC CGGAGACTGT GGTGCGCCTG CAGCTCTGGG 240 

ATATOGCa(36 TCAAGAAAGA TTTGOAAACaV TGACGAGGGT CTATTACCGA GAAGCTATGG 300 

GTGCATTTAT TGTCTTOGAT GTCACCAGGC CAGCCACATT TGAAfiCAGTG GCAAAGTGGA 360 

AAAATGATTT GGACTCCAAa TTAAQTCTCC CTAAXGGCAA ACCGGTTTCA GTGGTTTTGT 420 

TGGCCAACAA ATOISACCAa GGSAAGQATG TGCTOWGAA CAATGGCCTC AAGATGGACC 480 

ACrrTCTQCAA OGAGCACGGT TTCGTAGGAT GGTTTGAAAC ATCAOCAAAG GAAAATATAA 540 

ACATTGATGA AQCCTCCAGA TOCCTGGTGA AACACATACT TGCAAATGAG TGTGACCTAA 600 

TCGAGTCTAT TGAGCCGGAC GTCGTGAAGC CCCATCTCAC ATCAACCAAG GTTGCCAQCT 660 

GCTCTCGCTG TGCCAAATCC TAGTAGGCAC CTTTGCTGGT GTCTGGTAGG AATGACCTCA 720 

TTGTTCCACA AATTGTGCCT CTATTTTTAC CATTTTGGGT AAACGTCAGG ATAGATATAC 780 

CACATGTGGC AAGCCAAAGA TCTATGCCTC TGTTTTTTCA ATGAGAGAGA AATAGCAAAT 840 

6TTCTTTCTA TGCTTTCCTC ACCXTCATCA CAGTGTTTAC AAACTTTTGA AAATATTTAQ 900 

TCTQTTACAA ACTTCTGTCA TGTAGCTOAC CAAAATCCTG CAGGGCCACA GTCGGCACTG 960 

TTATTTGCTT CTTTTAATCA GCAAAGGCCT CAAGTCTTAA AATAAAAGGG GAGAAGAACA 1020 

AACTAGCTGT CAASTCAAGG ACTGGCTTTC ACCTTGCCCT GGTGTCTTTT TCCAGATTTC 1080 

AATATATTCT CTQATGGCCT GACA(»KCTA TTAAGTAGAT GTGATATTTT CTTCCAAGAT 1140 

GACCTCCATT CTCGGCAGAC CTAAOAlSTTa CCTCTGAGTT AOCICTTTGO AATCGTGAAC 1200 

ACAGGTOTGC TATATTGTCC TTGTCCTAAC TOTCACTTCC CATGSCCTGA ATGT TGGC TT 1260 

AACTGAATAT TGTATGAAAA GACATGCCTC CATATGTGCC TTTCTGTTAO Cl'Cf Cm'GR 1320 

CrCAAOCTGT GGGGCTCCTC TATACATQCT ATACATGXAA TATATATTAT ATATATTTTT 1380 
GCAAOTGAAC AATAAAACAT TAAAAGATAA AA 

Seq ID NO: 255 Protein sequence:
Protein Accession #: NP_071732



1 11 21 

I I I 

MOAFHKEHI.Y KLLVIGDLGV GKTSIIKRYV 
LQLWDIAGQE HFGIOTTRVYY REAMGAFIVF 
SWLLANKCD QGKDVLMNHG LKMDQPCKEH 
BCDIiMESIEP DWKPHLTST KVASCSGCAK 

Seq ID NO: 256 DNA sequence
Nucleic Acid Accession #: NM_016321
Coding sequence: 25..1464

1 11 21 31 41 51

GGAACOGCCC GCTGCCAGCC CGOCCAGGCA CCCCTGCAGC ATGGCCTGGA ACACCAACCT 60 

CCGCTGGCGG CTGCCGCTCA CCTGCCTGCT CCTGCRGGTG ATTATGaTQA TTCTCTTCGG 120 

GGTQTTOGTG CQCTACGACT TCGAGGCCGA CGCCCACTGG TGGTCAQAGA GGACGCACAA 180 

GAACrXGAGC GACATGGAGA ACX3AATTCTA CTATCGCTAC CCAAGCTTCC AQGACGTGCA 240 

OG'TOATGGTC TTOGTGGGCT TOGGCTTCCT CATQACTTTC CTGCAGCQCT AOQOCTTCA6 300 

COCCOiaOGC TTCAACTTCC TGTTGGCRGC CTTOQOCATC CAGTGGGOGC TGCTCATGCA 360 

aaacxGorrc CAcrrcraAC aagaccgcia GATcoroara gqcgtgoaga acctcatcaa 420 

CGCTOACTTC TGCGTGGCCT CTQTCTOOGT GGCCTTTGQG GCAGTTCIGG GTAAAGTCAG 480 

CCCCATTCAG CTGCTCATCA TGACTTTCTT CCAAGTGACC CTCTTCGCTG TGAATGAGTT 540

CATTCTCCTT AACCTGCTAA AGGTGAAGGA TGCAG6AQQC TCCTTGACCA TCCACACATT 600 

TGGCBCCTAC TTTGOGCTCA CAGTGRCCCG GATOCTCTAC CGAOGCAACC TAGAGCAGAG 660 

CAAGGAGAGA CAGAATTCTQ TGTACCftGTC GGACCTCTTT GCCATGATTG GCACXXTCTT 720 

CCTGTGGATG TACTGGCCCA GCTTCAACTC AGCCATATCC TACCATGGGG ACAGCCAGCA 780 

CCGAaCCQCC ATCAACACCT ACTGCTCCTT GGCAfiCCTGC QTGCTTACCT OGGTGGCAAT 840 



31 41 51

I I I 

HQMFSSHYRA TIGVDFALKV MMDPBTWH 60 
DVTllPATFEA VAMMCNDLDS KLSDPNGKPV 120 
GFVGHFETSA KBNIKIDBAS RCLVKHILAH 180 






WO 02/086443 

ATCCAGTGCC CIGCKCAAGA 
GTGGCCGTGG 
TTCGTCTGCG 



CATCATCGGC 
CCTGGRGTCC 
TGGCRTCATA 
TGGAAAAGAA 
AAGAACACAG 
• GGGTGGCRTC 
GAACTGCTTT 
CCCIGAG6AC 
CCCACTKCCC 



AGGGCAAQCT 
GTAOC6CTGC 
GCATCATCTC 
TCCAGGACRC 



AAGAOTQAGC 
CCTCCCCTTC 
ATCCAW3CCXJ 
AQAAAAACAG 
ACAACTTAGC 
AGCATCTCCT 



GGGCTTGTCC 
GGAAAGTTCC 
ATrGTGGQGC 
GAGGATGCGG 
CCCACCTTCA 
ATGGCTTCCT 
CAQACTSTCC 
AAGCAGCACC 
ATCCCAGGGQ 

GCTCAAAGTG 
TGCCAGTCAC 
ATGCTCCCTG 
CATTCTTGTT 



AGATTTATGG 
TCATTTTGRG 
TCTACTGGGA 
AGCCCTCAGG 
CGGTACCCTT 
TGGGGCCCAG 
CCCACXTTGCT 
GTCTGMCTQA 
GCAGAAGTTC 
GGGCTGG6AC 
CACCTATGAG 



TGAGATGATG 
CACCCTGGGT 
ATGTGGCATT 
GACAGCGGCC 
CTTTCAAGGT 
TCTCTTGGTG 
ATTACCATTC 
GATOCCTGAA 
ACCCTCAGTA 
GGTACCCTAG 
AGGAGCTGGT 
GGCTTGGCCT 
GAATGQASAA 
TGCCTCTGCC 



CTCATGCCTT 
TTTGTATACC 
AACAATCTGC 
TCCGCCAGCC 
TTCAA0GGG6 



ATGCCACGCT 
ACGGTGCCCT 
TGACXICCATT 
ATGGCATTCC 
TTGAASTCTA 
ACTGQACCGC 
TGGCCCTGAT 



PCT/US02/12476



CCXTOU3TAC 
GCTCCCAGGG 
GCTGACCTAG 
CAAGGTGCCT 



CTGTCTACAT 
CCATGGTQTC 
CAGGTGAGGA 
CTAGGGATGC 
CCACCCCTGC 
CAAAGTGGGC 



CTCcrrrorA aaaaaaaaaa aaaaaraa 



CTCCCAGQAS 
CACCTOGGCC 
TOaCAQCCTC 



1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800
1860 
1920 






I 

MAWmKiRWR 
PSFQDVHVMV 
GVENIiINADF 
SMTIKTFGAY 
YHGDSQHRAA 
IJ4PTCALIIG 
SA8LEVYGKE 
WGQPSDENCF 



UPLTCLIJiQV 
FVQFGFLMTF 
CVASVCVAFG 
FGLTVTRILY 

nrrvcsiAAc 

FVCGIISTLG 
GliVHSFDPQG 
EDAVYWEMPB 



21 

I 

IMVILEGVFV 
LORYOFSAVG 
AVLQKVSPIQ 
RRNLEQSKER 
VLTSVAISSA 
FVYLTPFLES 



I 



FNPLLAAFGI 
LMMTFFQVT 
QNSVYQSDLP 
UIKKGKLOMV 
RLHIQiyrCGI 
GKFQIYGLItV 
GirSTVYIPED FTFKP86FSV 



QNAIiU4QGWF HFLQDRYIW 
LFAVIIBFII.Ii NIiUCVKDAGG 
AMIOTLFLHM YWPSFNSAIS 
HIQHATLAGG VAVGTAAEMM 
NKLKGIPGII GGIVGAVTAA 
TLAMAIjKGGI IVGLILRLPF 
PSVPMVSPI.P HASSVPLVP 



Coding sequence: 75..



GC31TATATCC 
CIACTQATCT 
TATACAAGTG 
tcctggaaag 
CCAGAGAAAA 



ATCTOAAACC 
TGAGCTCaVTA 
TTCaGTTOVG 
ATGGCAGTTT 
GTCTCAGAAA 
GACATTTCTO 



TGSTGQCCTA CAAAATTOCT G 
TAATTTTQAA 
TTGGTTAATT 
CCTTTATTTT 
CATTGTTCAA 
GATAGTAACT 



21 
I 

' CTGTGGTTGC 
CXQCAGCTCT 
TTCTTCTCAT 
TTTACTCGAG 
AAAXACCTAA 
AAACTGGTTQ 
GATATTGAGT 
GCTATCXaiGa 
CCACTGrrOQ 
GTACCTGRAA 
CTTC6TTCAT 



I 



CCOGGQAGCA 
TCGGCATCAA 
TCCAGAAATA 
ATAATGTGGT 
TAGTTATCTC 
GTGACAAGAC 
ATGAAATCCG 
AAGTTXCTT6 



GGGA ATCAC C 
CAGCATTTTA 
CGQACTCAOC 
GGAACAACTG 
AAATATTGAA 



rrCACIACTAC AATCC3VCAAA 



AAC3WTOAAA 



A-EATTTGTAC 
TTATAAAATC 
AAAAAAAAAA 



TTTACATGGA 
TTTGGTACCT 
AAGGAACCAG 
GTAGATGGAA 
AGTAGTTTGA 
TAAATATTCA 
TATTGCTGTA 
TTCCAATTAT 
TGTTTAATGT 
AAGTTTTAAG 



ATT TGACT TA 
GAGGTmTT 
AAACTTGTGC 
CTCAGTATAG 
GAATCTTTGT 
TAGCTCCTTT 
TTGACTTTAA 
TCTGTGATAC 
TGAAAGTGAG 



TOTCATCTAT 
ATGATACTTA 
CCATGGAGTT 
TGTCAACATT 
TATAAAGCTA 
GTAGGGAGAT 
TAAGGTCCTG 
TGACCTTCAT 
TTTATGTAAC 
AGAACrCTTA 
GAAATAAACT 



AGTT6ATATG 
CTGAACTGTG 
AACATCATQA 
GTQATGTATA 
GATCCTTTCX: 
ATTTAAGTAT 
AAAGTAACTC 
TTCATGTATA 
TTGAACCTAT 
AAA ATGT TTT 
TAAGTTTGTT 



CTGCGCGGGA 
TATCAGOGTG 
TTGCTTGTAA 
AAAGATTGGT 
AGTGGTGAGG 
GACAGTGCAC 
AGACAGATCA 
.CTGCTGATTT 
CAGTTTATTA 
GTAAATAGCA 
AATGTAATTG 
TTTTATTTCA 
TGTAATTOTT ' 
ATT TATTG CA 
TTCCTTTQAA 
TAAATCAGAT 
AAAATACAAC 
ATAATCTATA 
GTTTTCCXrrA 
OAAGCAATGG 
TTCATGTGTT 
TTAAAAAAAA 



1200 
1260 
1320 
1380 






I I I I I I 

MALQLSREQQ ITLRGSAEIV ABFFSFGINS IIiYQRGIYPS BTPTRVQKYG I.Tia.VTTDI.E 
LIKYUmWE QDXDWLYKC8 VQKLVWISN IBSQEVLBRW QFDIECDKTA XDDSAPREKS 
QKAIQDEXRS VIRQITATVT FLPLIiBVSC3 FDLLIYTDKD LWPEKWKBS QPQFITHSBE 
VSUtSFTTTI HKWMSMVAYK IPVMD 

Seq ID NO: 260 DNA sequence
Nucleic Acid Accession #: NM_001211
Coding sequence: 43..3195

11 21 



I I I 

AAAGGCCTGC AGCAGGAOGA GCSACCTQA6C ~ 
GAAGGGGGTG CTCTQAaTOA AGCCATGICC 
OAAAATGTAC AACCTTTAAO GCAAGQGOGQ 
CAAGAATCrO CCTGTAftCAA TACTCTTCAG 



ATGAATGOGA ACTGAGTAAA 
CGCrrCAGGG AGCACTGGCA 
GQGCATTTaA ATATOAAATT 




OGATTTTACA CTGGAAATGA CCCICTGOAT GTTTQGGATA GGTATATCAG CTOGACAGAG 300 

CAGAACTATC CTCAAOGTGG GAAAGAGAGT AATATGTCAA CBTTATTAGA AASAOCiaTA 360 

GAAOCACTAC AAOSAOAAAA ACGATATTAT AGT3AT0CTC QATrTCTCRA TCTCTGGCTT 420 

AAAXTAGGGC GTTTATGCAA TGAGCCTTTG GATATGTACA GTTACTTGCA CAACCaVAGGO 480 

ATTGGTGTTT CACTTGCTCA GTTCIATATC TCATGGGCAG AAGAATATGA AGCTAGAGAA 540 

AACTTTAGGA AAGCAGATGC GATATTTCAG GAAGGGATTC AACAGAAGGC TGAACCACTA 600

GAAAGACTAC AGTCCCAGCA CCGACAATTC CAAGCTOGAG TGTCTCGGCA AACTCTGTTG 660 

GCACTTGAGA AAGAAGAAQA GQAGGAAGTT TTTGAGTCTT CTGTACCACA AOGAAGCACA 720 

CTAGCTGAAC TAAAGAGCAA AGGGAAAAAG ACAGCAAGAG CTCCAATCAT CXXSTGTAGGA 780 

QGTGCTCrCA AGGCTCCAAG CCAGAACAGA GGACTCCAAA ATCCATTTCC TCAACaGATG 840 

CAAAATAATA GTAQAATTAC TGrmrrTOAT GAAAATGCTG ATeSAGGCTTC TACRGCAGAQ 900 

TTCTCIAAGC CTACRGTCCA GCCATGGAXA OCACCCCCCA TGCCCA(3GGC CAAAGAGAAT 960 

GAGCTGCAAO CAtSBCXXTCTG GAACACAGGC AGGTCCTTGG AACaC3W3GCC TGGTGGCAAT 1020 

ACAGCTTCAC TGATAGCTGT ACCCGCTGTG CTTCCCAGTT TCACTCCATA TGTGGAAGAG 1080 

ACTGCACAAC AGCCAGTTAT GACACCATGT AAAATTGAAC CTAGTATAAA CCACATCCTA 1140 

AGCACCAGAA AGCCTGGAAA GGAAGAAGGA GATCCTCTAC AAAGGGTTCA GAGCCATCAG 1200 

CAAGCGTCTG AGGAGAASAA AGAGAAGATG ATGTATTGTA AGGAGAAGAT TTATGCAGGA 1260 

GTAQaaOAAT TCTCCTTTGA AGAAATTOGG GCTGAAGTTT TCCGGAAGAA ATTAAAAGAG 1320 

CAAAGGGAAG CCQAGCTATT GACCAOTGCA aAGAAGAOAO CAGAAATGCA GAAACAGATT 1380 

GAAGAGA'TOQ AGAAGAAlGCT AAAA6AAATC CAAACTACTC AOCAAGAAAG AACAGOtOAT 1440 

CAOCAAGAAG AOACGATGCC TACAAAGGAG ACAACTAAAC TCCAAATTGC TTCOGAGTCT 1500 

CAGAAAATAC CAGGAATGAC TCTATCCAQT TCTGTTTOrrC AAOTAAACTO TTQTGCCAGA 1560 

GAAACTTCAC TTGCGGAGAA CATTTGGCAG GAACAACCTC ATTCTAAAGG TCCCAGTGTA 1620 

CCTTTCTCCA TTTTTQATOA GTTTCTTCTT TC3«3AAAAGA AGAATAAAAG TCCTCCTGCA 1680 

GATCCCCCAC GAGTTTTAGC TCAACGAAGA CCCCTTGCAG TTCTCAAAAC CTCAGAAAGC 1740 

ATCACCTCAA ATGAAGATGT GTCTCCAGAT GTTTGTGATG AATTTACAGG AATTGAACCC 1800 

TTGAGCGAGG ATGCCATTAT CACAGGCTTC AGAAATGTAA CAATTTGTCC TAACXXAQAA 1860 

GACACTTGTG ACTTTQCCAG AGCAGCTCGT TTTOTATCCA CTCCTTTTCA TGAGATAATG 1920 

TCCTTOAAGO ATCTCCCTTC TGATCCTQAG AGACTGTTAC CGGAASAAGA TCTAGATGTA 1980 

AAOACCTCTG AGGACCAGCA GACRaCTTOT GGCACTATCT ACAGTCAGAC TCTCAGCATC 2040 

AAGAAGCTOA GCCCAATTAT TGAAGACAGT OGTGAAGCCA CACACTCCTC TGGCTTCTCT 2100 

GQTTCTTCTQ CCTCGGTTGC AAGCACCTCC TCCATCAAAT GTCTTCAAAT TCCTGAGAAA 2160 

CTAQAACTTA CTAATGAGAC TTCAGAAAAC CCTACTCAGT caCCATGGTG TTCACAGTAT 2220 

CGCAGACAGC TACTGAAGTC CCTACCAGAG TTAAGTGCCT CTGCAGAGTT GTGTATAGAA 2280 

GACAGACCAA TGCCTAAGTT GGAAATTGAG AAGGAAATTG AATTAGGTAA TGAGGATTAC 2340

TGCATTAAAC GAGAATACCT AATATGTGAA GATTACAAGT TATTCTGGGT GGCGCCAAGA 2400 

AACTCTGCAG AATTAACAGT AATAAAGGTA TCTTCTCAAC CTGTCCCATG GGACTTTTAT 2460 

ATCAACCTCA AGTTAAAGOA ACQTTTAAAT GAAGATTTTG ATCATTTTTG CAGCTGTTAT 2520 

CAATATCAAG ATGGCTGTAT TQTTTGGCAC CAATATATAA ACTGCTTCAC CCTTCAQGAT 2580 

CTTCTCCAAC ACSKOTOAATA TATTACCCAT GAAATAACAG TGTTGATTAT TTATAACCTT 2640 

TTGACAATAQ TGGAGATOCT ACACAAAGCA GAAATAGTCC ATGQTGACTT GAGTCCAAGG 2700 

TGTCTGATTC TCAGAAACAG AATCCACGAT CCCTATGATT QTAACAAGAA CA ATCAAG CT 2760 

TTQAAGATAG TGGACTTTTC CTACAGTGTT GACCTTAGGG TGCAGCTGGA TGTTTTTACC 2820 

CTCAGCGGCT TTCGGACTGT ACAGATCXTTG GAAGGACAAA AGATCCTGGC TAACTGTTCT 2880 

TCTCCCTACC AGGTAGACCT GTTTGGTATA GCAGATTTAG CACATTTACT ATTGTTCAAG 2940 

GAACACCTAC AGGTCTTCTG GGATGGGTCC TTCTGGAAAC TTACCCAAAA TATTTCTGAG 3000 

CTAAAAGATG GTGAATTGTG GAATAAATTC TTTGTGCGGA TTCIG AATOC CAATGATQAG 3060

GCCACAGTGT CTGTTCTTGG GGAGCTTGCA GCAGAAATGA ATGGGOTTTT TGACACTACA 3120 

TTCCAAAGTC ACCTGAACAA AGCCTTATGG AAGGTAGOGA AGTTAACTAO TCCTGGGGCT 3180 

TTGCTCTTTC AGT6AGCXAG GCAATCAAGT CTCACAGATT GCTGCCTCAQ AGCAATGGTT 3240 

GTATTGTGGA ACACTQAAAC TGTATGTGCT GTAAT TTAA T TTAGGACACA TTTAGATGC» 3300 

CXACCATTGC TGTTCTACTT TTTGGTACAG GTATATTTTG ACSTCACIGA TATTTTTTAT 3360 

ACAGTGATAT ACTTACTCAT GGOCTTOTCr AACTTTTGTG AAGAACTATT TTATTCTAAA 3420 

CAOACTCATT ACAAATGGTT ACCTTGTTAT TTAAOCCATT TGTCTCTACT TTTCCCTGTA 3480 

CTTTTCCCAT TTCTAATTIG TAAAATGTTC TCTTATGATC ACCATOTATT TTOTAAATRA 3540 
TAAAATAGTA TCTGTTAAAA AAAAAAAAAA AAAAAAAAAA AAA 



1 11 21 31 41 51

I I I I I I

MAAVRKEGGA LSEAMSLEGD BHEIiSKENVQ PLRQ6RIHST LQQALAQBSA CMNTLQQQKR 
AFBYEIRFYT GNDPLDVWDR YISVITEQHYP QGGKESNMST LLERAVEALQ GEKRYYSDPR 
FLNliWIiKLaR LCNBPLDMYS YUOJQGIGVS lAQPYISMAE EYEARENFRK ADAIFQEXJIQ 
QKAEPIiERIiQ SQHRQFQARV SROTLLALEK EEBEEVFBSS VPQRSTLAEL KSKGKKTAEA 
PIIRVGQAIjK APSQNRGWJN pfpqqmqnns ritvfdbmad EASTAELSKP tvqpwiappm 
BRAKEKELQA QPWNTaRSI.E HRPRGNTASL lAVPAVIiPSF TPYVEBTAQQ PVMTPCKIEP 
SIHHZX.STKK FGKBEGOPLQ RVQSHQQASB EKKBXMMYCK EKIYAGVGBF SFEEIRAEVF 
RKKXiKEQHBA BIiLTSABKRA EKQKQIEEMB KKLKBIOTTQ QEHTGDQQBE TMPTKETTKIi 
QXASESQKZP GMTIiSSSVOQ VNCC3UlETSIi AEHIKQBQPH SK6PSVFFSI FDEPIXSEKK 
NKSPPADPPR VLAQRRPIAV LKTSESITSN EDVSPDVCDE FTaiSPI.SED AIITGPRNVT 
ICPNPBDTCD FARAARFVST PPHEIMSLKD LPSDPERLIjP EEDLDVKTSE DQQTACGTIY 
SQTIiSIKKI.S PIIEDSREAT HSSGFSGSSA SVASTSSIKC WJIPEKLELT NETSENPTQS 
FWCSQYRRQIi LKSIiPEIiSAS ABLCIEDRPM PKLEIBKEIE LCaiEDYCIKR BYLICBDYKL 
FHVAPRNSAB LTVIJWSSQP VPWDFYINLK IJCERLUEDFD HFCSCYQYQD GdVWHQYIH 
CBTUaOhUlB SBYITHEITV LIIYNliLTIV EMMHCABIVH ODLSPRCLIL RNRIHDPYDC 
NXmiQALKIV DFSYSVDIiRV QLDVFTI.SOF RTVQIIjEGQK IIANCSSPYQ VDIiFOIADlA 
HUJ.FKEHLQ VPHDGSPWKL SOHISELXDO ELHKKPFVRI LNANDEATVS VU3ELAAB0I 
6VFOTTFQSH LMKAUIKyGK LTSPGALLFQ 



Seq ID NO: 262 DNA sequence
Nucleic Acid Accession #: NM_003784
Coding sequence: 365..1507
I I 

OTCTACrTAT CAATAAGC3\G 
T&AAACraAA TTCTCAGAAT 
ATTAGGTAGT GSTAAAACAG 
TQTACAGGOA AGCTCTCCTT 
CAACAGTCTT GAGAAGTGTG 
AAGGAAACCA GATTCCCATC 
TGCAATC3GCC TCCXTTTGCTG 
GGATGACAAT CAAGGAAATG 
GGCCCTGGTC CGCTTGGGCG 
TGTTAACACT GCCPCAGQAT 
ACTGAAAAGA 



CTGCCTGTGC 
TTTASAACAA 
GCTCCCTTCG 
CATCACCTTC 
GAAAOVTTTT 
ACTGCTTCTG 
CAGCAAATGC 
GAAATGTGTT 
CTCAAGATGA 
ATGGAAACTC 
ATATAAATGC 



I 

' AGAGTGCAGG 
ATTTTTGTCT 
AAGCTCTCCT 
CTAAGTGCAT 
CTTTGTGAGT 
GGTATCAQAT 
AGAGTTTTGC 
CTTTTCXrrCT 



TCATCACCTT 
GGGGGAAAAT 
GAGAACAGAT 






GAATGGGCTT TTTGCTGAAA 
AAAATTATAC GATGCCAAAO 
ACGTAATATT RATAAGTGGG 
TGAAGGTGGC ATAAGCTCaT 
CAAGTGGCAA TCAGCCTTCA 
GrGCTCTOGO AAGGCAGTCQ 
TGAGGACCCA TCAATGAAOA 
TCTGCTGCCT GASAATGACC 
GGAATOQACC AATCCAAGGC 
CAAGAXAQAG AAGAATTATG 
CTTTGATGAA TCCAAAGCAG 
AAGQATGATG CACAAATCTT 
CaCAGGAAGT AATATTGTAG 
COCATTCXTTA TTTGTTATCA 
CCCTTGAAAA TCCAATTGGT 
AAQTCAATAG ATYTGRGtXT 



TCTTTCTTCC 
R6TGCTCAAC 
AGTTATCGCA 
CTAGAAATAA 
CTTCATATGT 
ATTAGGGACC 
TGATAAGACA 



AGAATAAAGA 



TGCTCTTAGC 
CAOGCICATT 
TGGTAAGGAQ 
GATATTCTAG 
GTGTGAAGGA 
TTGATTTTAA 
TACCAAAATA 
ATATQTACAT 
QAATCATATT 
AATACAACAT 



AAGTGTATGG 
TGGAGCGAGT 
TTGAAAATGA 
CTGCTGTAAT 
CCAAGAGCGA 
CCATGATGCA 
TTCTTGAGCT 
TCTCTGAAAT 
GAATGACCTC 
AAATGAAACA 
ATCTCTCTGG 
ACATAGAGGT 
AAAAGCAACT 
GGAA6GATGA 
TTCTGTTATA 
AAT1GGAAAA 
ACACTGQTGA 
ATTCTACCAC 
TCTATCATTC 
AACGTAGAAG 
CTTCATTGTA 
TAAATTTTCT 
ATCAGTGTAT 
TTTCATTAAT 
GTTTTTTCAA 
TCATATGCTO 



TTCTAATAGT 
ATCCCACAAG 
CTTTCATAAG 
TGACTTTACG 
AACACATGGC 
GGTGCTGGTG 
AACCATAAAT 
TCAGGAACGG 
CAGATACAAT 



TTCAACCTGT 
CTGAGCCTCT 
CAGATTGATA 
CAGTCAGGGC 
QATTATGATC 
GACTACATTG 
AATCATTTAG 
AAAATCAAGA 
AATGCTGTGT 
TGCCATTTCA 
AAGTTCAATT 



TAAGTATGTT 
ATATTTGAGA 
GATTGCTTCG 
CACTGAGGAG 
CCCTCAGTCC 
CATCATCTTA 
GCAGTCCCCA 
ATGTGGTGTT 
CTTGACCCTT 
CATGTGTCTC 
TCCX:CCATGA 



CTGACCTTTC 
GAGGTATTTT 
GCCCTAGGGC 
GGGGGTCGTC 



I 

GGACAGCCTT 
ACTTTGGTTC 
CCTAAGTGCA 
ACCTAGGGCT 
CACCTAGAGA 
C3VCTCCATTT 
TCAGAGAGAT 
TCX3CTGCCCT 
AGTTGCTTCA 
TCCAGTCTCA 
TCAGCATTGT 
AGTGTGCCGA 
AAGACACTA6 
ACX3TGATTQG 
ACTTCAAAGG 
AATCTCCCAA 
TGTCTGTTAT 
ACATGTACGT 
AGAATCTAAT 
TTCCTCAGTT 
TGAAAGATAT 
TGTATATATC 
AGCAATCTAO
TTGTTGACCT 
AATCTASATG 
GCTTTCAATT 
ATATTAAAGA 
TGTAGTTTAT 



ACOCTOTTTA 
TTCAGTGGCA 
CMCATCAAA 
TCCTTTGAGT 
CCTAGACACC 
ACCCATTTCT 
CCCGTCTGGA 
GATCCTTTTT 
QAAATAAOCC 
AIGAAGATTT 
GIAAAAAA3G 
OACAAATTTT 
TCTTTTAACT 
AAOTTTTTCC 



AAGTTTCTTG 
GRACCACCAC 
TTATTTCTTC 
TGGTTGATTG 
AATTTCATTG 
AATTATGGAa 
GAAACTCTAC 



TAOABTTTAC 



1020 
1080 
1140 



1380 
1440 
1500 
1560 



1740 
1800 
1860 
1920
1980 
2040 
2100 
2160 



11 



31 
MASLAAAMAE FCFNLFREMD DNQGMGHVPP SSI.SI.PAAliA IjVRLGAQDDS LSQIDKLIBV
NTASGYraiSS NSQSGLQSQL KRVFSDINAS HKDYDI.SIVN GLFAEKVYGP HKDYIECAEK 
LYDAKVERVD FTIIHI.EDTRR KINKWVENBT HGKIKNVIGB GGISSSAVMV LVNAVYFKGK 
WQSAPTKSBT INCHFKSPKC SGKAVAMMHQ BHKPlir.SVlE DPSMKILELH YNGGINMWI, 
LPEMDLSEIE HKLTFQNLMB (flMPBFMXSK XVBVFPPQFK lEgWYEMKQY IJ»ALGWa3IP 
DBSKADLSQI ASGGRLYISR MMHKSYIEVT EB6TBATAAT GSNIVEKQIiP QSTLFRADHP 
FLFVIKKDDI ILPSGKVSCP 

Seq ID NO: 264 DNA sequence
Nucleic Acid Accession #: AB052906
Coding sequence: 74-814



AAAACCTTGA GGIGATTCAT C 



TCCGGCTGGT 
ATCCCTAAGT 
ACTTTTCTTC 
AAACTAAATG 
ATACTTACAG 
ACXXTTGCAGG 
CAGTTCAGTT 



GCrC!CTGCTG 
CATCACCGTC 
GGATGAAAAG 
CCTGGGGAAO 
GGTGGTGGAC 
GGAACCCCTC 
TGGATCTTGG 
AATGTSGACA 
GGTTGTGGCC ATGTCCTTCC A 
CTTCTTQATG GGCATGGACA G 
ACCCAACTCA 
TGCTTCATCC 
CAAAAGGCTC 
AOGACCTACG 
AGCTCATTCA 
ATTATGCAAT 



TCACTCCTGG 
ACTATGACTG 
TCACAAOSOC 
AGCAACTGCG 
CCAGGATGTC 
TOGATGGGCA 



ACCACGGTGG 
TGGCAACAAG 
CTGGAAAGCa 
TGAC3VTTCAG 



GATCTTCCTC 
AAAGATGAAA 
AATGGGAGAC 
GCCAAGTGCA 
CACCACCCrC 



41 
I 

CAAGTCTCTC 
CTTCTGTGCC 
CCTCACTCTC 
TCTGCGGTTC 
ACAGTCACAC 
CAGAACCCAG 
CTGQAGAATT 
AAAGCTGAAG 
CTCTTTGACT 



CTCCCTAGCG 
TCCCGCTTCt 
TTTGCTATGA 
AAGOCCAGGT 
CTGTCAGTCC 



CATCCTCCCC 
AAGCTGATAC 
CCAGCTGCCC 
TGGACCCAAT 
TACCXAACAT 



GTACTICTTT 
TRGACTTCAG 
ATAAGAAAAA 
TTTAAATAAA 



GAATGAT6AT 

ACCTCTGQGG 
ATTTATATTA 
GAGTTCTATT 



ACACACCCAA 420 
GACACAGCAG 480 
CAGAQAAOAG 540 
AGAATGACAA 600 
GGCTTGAGGA 660 
TOGCCATGTC 720 
GCCTCCTCAT 780 
TGACAGGITA 840 
: GGTCITGATC AAACCOGCCC TTCIGTCTGG 900 
960 
1020 
1080 
1140 
1200 



TGTATAGGAT 
GOAGCACCAC 
ATCCTTTGCT 



CTGCCTTGAT TCCTTTTGCC AACAATTTTA 
TTTCrCTTGG TGCTACCTGA TGGAATTCCT 
TATATCATTT TCTTTCTTCT CTTTTTGTTT 
CTCTTTCTTG C31AATGATAT TGTCAGTAAA 
ATTCTTTCCG TGTCCTGAAA GAGAATTTTT 
ATGATTGTTT CCTTTAGTAA TTTATTGTTC 
TCCC3VAAAAA AAAAAAAAAA A 



CCAGCAGTTA 
GCACTTAAAG 
GGAAAATCAA 
ATAATCACGT 
AAATTATTTA 
TGTACIGATA 
1 11 21 31 41 51

I I I I I I 

MAAAAATKIL LCLPLLLIiS GWSRAGRADP HSLCYDITVI PKFRPGPRWC AVQGQVDEKT 60 
PUJYDCXSNKT VTPVSPIiGKK UJVTTAWKAQ NPVUIEWDI LTEQIiRDIQL ENYTPKEPLT 120 
LQARMSCBQK AEGHSSGSWQ FSFDGQIFLL FDSEKRMWTT VKPGARKMKE KHENDKWAM 180 
SPHYFSMGDC ICWLEDFLMQ MDSTLBPSAQ AFIAMSSGTT QUtATATTLI LCCIAIILFC 240 
FILPGI 

Seq ID ^ 
Nucleic 

Coding sequence: 127-444 

1 11 21 31 41 51

ATTGATGATA TATTTAACGA AATCAAATTT GGTGAATATG TGGAC3\CTGG AA AGCTAATC 
GACAAGATCA ACTTACCAGA TTTCCTAAAA GTGTACCTTA ACCACAAGCC ACCTTTTGGT 
AACACCATGA GTGGCATCCA CAAGAGCTTT GAGGTGCTCO GTTATACCAA CTCCAAAGGQ 
AAAAAGGCCA TTCGAAGAGA GGACTTCCTG AGACTGCTCG TTACTAAAGG TGAGCATATG 
ACGGAGGAGG AGATGTTGGA TTGCTTTGCT TCACTGTTTG GCCTGAATCC CGAGGGATGG 
AAATCCGAGC CTGCAACCTG CTCOGTCAAA GGTTCAGAAA TTTGCCTTGA AGAAGAACTT 
CCAGACQAAA TCACTGCAGA AATATTCGCXJ ACTGAAATTC TTGGCTTAAC CATTTCAGAA 
GATTCCGGCC AGGATGGTCA GTGAAGTTAC CAGGAATGTT TAAAGCa.CAA AGGACTTTGG 
GTGTGTGTGC ATGCACATGT GTGTGrrTTC - CATOAGGCAC TGCTTTTTAT GCATTTCCCT 
CCCCCCTCTC ATCTTTAGAA CATTTAGACA TTAAAQCAAQ TTTCTGGTGA GCAATG 



ilSGIHKSFBV LQYTNSKGKK AIHREDFLRL LVTK6EHMTE EEMLDCFASL FGLNPEGWKS 
BPATCSVKGS EICLBEELPD EITAEIFATE IliQLTISEDS GQOGQ 

Seq ID NO: 268 DNA sequence
Nucleic Acid Accession #: NM_001898
Coding sequence: 57-482 

1 11 21 31 41 51 

GGCTCTCACC CTCCTCTCCT GCAGCTCC&G CTTTGTGCTC TGCCTCTGAG GAGAOCATGG 
OCCAQTATCT GASTACCCTG CTGCTCCTGC TGGCCACCCT AGCTGTGOCC CTGGCCTGGA 
GCOCCAAGGA GGAQGATAQG ATAATCCOQQ GTGOCATCTA TAAOGCAGhC CTCAATGATG 
AGIGGGTACA GOGTGCCCTT CSMrrTCQCXa TGAGOQASTA TAACAAGGCC ACCAAAGATG 
ACTACTACAG ACGTOCGCTQ OGGQTACTAA QAGCCAGGCA ACAGACCGTT GGGGGGGTGA 
ATTACTTCTT CGACGXAGAG GTGGGCCGCA CCATATGTAC CAAGTCCCAG CCCAACTTGG 
ACACCTGTGC CTTCCATGAA CAGCCAGAAC TGCAGAAGAA ACAGTTGTGC TCTTTCGAGA 
TCTACGAAGT TCCCTGGGAG AACAGAAGGT CCCTGGTGAA ATCCAGGTGT CAAGAATCCT 
AGGGATCTGT GCCAGGCCAT TCGCACCAGC CACCACCCAC TCCCACCCCC TGTAGTGCTC 
CCACCCCTGG ACTGGTGGCC CCCACCCTGC GGGAGGCCTC CCCATGTGCC TGCGCCAAGA 
GACAGACAGA OAAGQCTGCA GGAGTCCTTT GTTGCTCAGC AGGGCGCTCT GCCCTCCCTC 
CTTCCTTCTT GCTTCTAATA OCCCTGOTAC ATGGTACACA CCCCCCCACC TCCTGCAATT 
AAACAGTAGC ATCGCC 



JnaYLSTLIiL IlATLAVAIA HSPKBEDRII PGGIYNADtN DEWVQRAIiKF AISEYNKRTK 
DDYYRRPLRV LSARQQTVaG VMYPFDVEVG RTICTXSQPN IiDTCAFHEQP ELQKXQLCSP 
BIYEVFHBNR RSI.VKSRCQB 8 

Seq ID NO: 270 DNA sequence
Nucleic Acid Accession #: XM_093210
Coding sequence: 13-1854



ATGQCAAQCG CCQGAATCTC CTCAGCTGCC GXTTCACAAA AGAGGTACCA GGTCCGCACC 
AAAI3QAGCAC ACAAQCAGCA CCAGGAGCTG CAGAAGAAGG AGGOGGCAGC GATGGACCAG 
GGCAGAGGGA ATGQGGAGGG GGCATCCTAC CCCATATCTG AGGTGCGACT GCGGGACGTA 
GAGCGGACTG GGCCTTTCCC GTTGGCGCGT GGCCTCAATC AGGACTTCTT GCCCAGGTGC 
GCCTTCAAAA CGGTAAOAGC TGCAACTGAA CGTGTGAGAC ATGGTGCAGA TAGGCTGAQA 
GGCGGCGGGA GAGATGCCCA TGAACTCAAG TACCCGGACA OGCCCTCCAC TTCIACCRCC 
ACQAGTAACA CCGCCCCCAC GGGACCGCXC TCGAGGTCCC CCAAOCCAAa GACGCMOGA 
GGAACGCCCC GGCGCGCGGC CAaCAflCGQC GGGCACCOGC CCAATGGCCA OGGAACTCAQ 
CACTGGCAGT CGGCCCTCCT CACACOGCAG GCGTGCAGTQ TGGCCGACGQ AGCCTOCOGG 
BCCGAGGACC CAGCTAGGCC GTCACCCCGG TTQCTCCCAC GGGAAGGGGC ACCAGGCAAA 
CTGCCCAAGG CCCOGAGCCC AGGCTCOCTG QOGOAGGCCT CCGCIGGTOC CQCCCRGAIC 
ATGGCOGCCA CCAGGCTOCC GAOCCKTGGC TTCCTGTCOG GGAACGGCCC GGOSTOCTGS 
CCCGGGCCTG GAGCCCCACA CCCGAGGGTQ CAGACTGGCT GCCAAOQCCA. CACTTTTGGC 600

TAJVAAGAGGC ACTGCCAGGT 6TACAGTCCT GQGCATGGGC ■IXJTT'XGAGCT TCGOGOGAGh 660 

GCCCAGCACT GGTCCCOGGA AAGGTGCCTA QAAQAACAAG GTGCAGGACC CCGTGCTGCC 720

TCAACAGGAG GGTGGGGGAA CAGCTCAACA ATGGCTGATG GGOGCTCCTG GTGTTQATAQ 780 

AGATGGAACT TGQACTTGGA GGCCTCTCCA CGCTGTCCCA CTGCCCCTGG CCTAGGOGGC 840

AGCCTGCCCT GTGGCCCACC CTGGCCGCTC TGGCTCTQCT QAGCAGCGTC GCAGAGGCCT 900 

CCCTGGGCTC CGCGCCCCGC AGCCCTGCCC CCCGCGAAGG CCCCCCGCCT GTCCTGGCGT 960 

CCCCGGCCGG CCACCTGCCG GGGGGACGCA CGGCaZGCTO GTGCAGTGGA AGAGCCCGGC 1020 

GGCCGCCGCC GCAGCCTTCT CGGCCCGCKC CCCXX5CXX3CC TGCACCCCX». TCTGCTCTTC 1080 

CXXXCGGGG6 CXX3CGCGGC30 CGGGCTGGGG GCCCOGGCAG CCGOGCTOGG GCAGaQQGGO 1140

OGCGGGGCTG CCGCCTQCGC TOSCAGCIGG TGCCGGTGCB CGOGCTCGGC CTGGqCCRCC 1200

GCrCCGACGA GCTGQTGOGT TTCCGCTTCT GCAGGGGCTC CTGGCGCOGC GCGOGCTCTC 1260 

CACACGACCT CAGCCTGQCC AGCCTACTGG GCGCCGGGGC CCTGCGACCG CCCC0GG6CT 1320 

CCXMGCCOGT CAGCCAGCCC TGCTGCCGAC CCACGCGCTA OJAAGCXSGTC TCCTTCATGG 1380 

ACX5TCAACAO CACCTGGAGA ACCGTGGACC GCCTCTCCGC CACCGCCTGC GGCTQCCTGG 1440

GCTGAGGGCT CGCTCCAGGG CTTTGCAGAC TGGACCCTTA CCGGTGGCTC TTCCTGCCTG 1500

GGACCCTCCC GCAGAGTCCC ACTAGCCAGC GGCCTCAGCC AGGGACX3AAG GCXTTCAAAGC 1560 

TGAGAGGCCC CTACCGGTGG GTGATGGATA TCATCCCCQA ACAQGTGAAG GGACAACTGA 1620 

CTAGCAGCXX CAGAGCCCTC ACCCTGCSGOA TCCCAGCCTA AAAGACACCA GAGACCTCAG 1680 

CTATGGftOCC CTTCGGACCC ACTTCTCACA GACTCTaGCA CTGGCCAGGC CTCCaACCTG 1740

GGACCCCTCC TCTGATGAAC ACTACAGTOG CrGAGGCATC AGCCCCCQCC CASGCCCTGT 1800 

AGGOACAGCA TTTGAAGOAC ACATATTGCA GTTGCTTGGr TGAAAGTGOC TQTGCTGGAA 1860 

CTGGCCTOTA CTCACTCATG GGAGCTGGCC CC 

I I I I I I

HELGLGGLST LSHCPHPHRQ PAI.NPTI<AAI< Ai;LSSVAEAS LQSAPtlSPAP RBGPPPVIiAS 
PACatLPOGHT ARHCSOBAHR PPPQPSRPAP PPPAPPSALP HGGHAARAGG PGSHAKAAOA
BGCRUtSQIiV PVRAL6LGHR SOELVRFRFC SGSCRRARSP EDIiSLASLLG AGAUIPPFGS 
RPVSQPCCRP TRYEAVSFMD VKSTWRTVDR LSATACGCLO 




1 11 21 31 41 51
ATGCCCGGCC TOATCTCAGC CCX5AGGACAG CCCCTCCTTG AGGTCCTTCC TCCCCAAGCC 60

CACCTGGGTG CCCTCTTTCT CCCTGAGGCT CCACTTGGTC TCTCCGCGCA GCCTGCCCTG 120 

TGGCCCACCC TGGCCGCTCT GGCTCTGCTG AGCAGOGTOS CAQAGGCCTC CCTGGGCTCC 180 

QCGCCCCGCA GCCCTGCCCC CCGCGAAGGC CCXKOOCCTB TCJCTGGCGTC CCCCGCOGGC 240 

CACCTGCCGG GGGGACGCAC GGCCCGCTGG TGCAGIGOAA GAGCCOSGOG GCXX3C0SCCG 300 

CAGCCTTCTC GGCCCGGGCC CCCGCCGCCT GC3VCCCCCAI CTGCTCTTCC CCQCGGGGGC 360

CGCGOSQCXSC GGGCTGGGGG CCCGGGCAGC CGCQCTCGGG CAGOGGGGGC GCGQ6GCTGC 420 

CGCCTGCX3CT CGCAGCTGGT GCCGGTGCGC GCGCTCGGCC TGGGCCACCG CTCCGAOSAG 480 

CTGGTGCX3TT TCCGCTTCTG CAGCGGCTCC TGCCGCCSJOS CGCGCTCTCC ACACGACCTC 540 

AGCCTGGCXA GCCTACTGGG CGCCGGGGCC CTGCGACCXJC CCCCGGGCTC CCGGCCCGTC 600

AGCCAGCCCT GCTGCCGACC CAOGCGCTAC GAAGCGGTCT CCTTCATGGA CGTCAACAGC 660

ACCTOOAaAA CCGTGGACCG CCTCTCCGCC ACCXSCCTGOQ GCTGCCTGGG CXGAGGQCTC 720 

GCTCCAGGGC TTTGCAGACT GGACCCTTAC OQQTGGCTCT TCCTGCCTGG GACCCTCCOG 780 

CAGACTCCX» CTAGCCAiGCG GCCTCAGCCA GGGAOOAAGG CCTCAAAGCT GAGAGGCCCC 840 

TACOGGTQGG TGATGOATAT CATCCCCGAA CAGGTOAAGG GACAACTGAC TAGCJWKXXX 900 

AGAGCCCTCA CCXTTGCGGAT CCCAGCCTAA AAQACACCAS ASACCTCAGC TATGQAGCCX: 960

TTCGGACCCA CTTCTCACAG ACTCTGGCAC TGGCCAGGCC TCGAACCTGQ QACCCCTCCT 1020 

CTGATGAACA CTACAGTGGC TGAGGCATCA GOCCXXX5CCC AGGCCCTGTA GGGACA6CAT 1080 

TTGAAGGACA CATATTGCAG TTGCTTGQTT OAAAGTOCCT GTGCTGGAAC TGGtXTOTAC 1140 
TCACTCATGG GAGCTQQCCC C 

I I I I I I

MPQIiISARGQ PLI.EVUPQA HLGALFLPEA PLGLSAQPAL WPTLAAIiALL SSVAEASI.GS 
APRSPAPKEG PPPVLASPAG HLPGGRTARW CSGHARHPPP QPSRPAPPPP APPSALPHGG 
RAARAGGPGS RASAAOARGC RLRSQLVPVR ALGLQHRSOE LVRFRFCSGS CRRARSPHDI. 
SUVSLLOAlOA LRPPPGSRFV SQPCCRPTKY EAVaPMDVMS TWRTVDRLSA TAGGCIA 



Seq ID NO: 280 DNA sequence

Nucleic Acid Accession #: NM_057090.1

Coding sequence: 29-715



1 11 21 31 41 51

I I I I I I 

CTGATGGGCG CTCCTGGTGT T6ATAGAGAT GGAACTTGGA CTTGGAGGCC TCTCCACGCT 
GTCCCACTGC CCXnGGCCTA GGCQGCAGGC TCCACTTGQT CTCTCCGCMC AGCCTGCCCT 
OTGGCCCACC CraOCCGCIC TOGCTCXGCT GAGCAGCGTC GCAGAGGCCT CCCTGGGCTC 

CGCGCCCCGC AGCCCTGCCC CCCGCGAAGG CCCCCCGCCT GTCCTGGCGT CCCCCGCOGG
CCACCTGCCG GGGGGACGCA OGGCCCGCTG GTGCAGTGGA AGAGCCCGGC GGCCGCCGCC 
GCAGCCTTCT CGGCCCGOGC CCCCGCCGCC TGCACCCCCA TCTGCTCTTC CCCGCGGGGG 
CCGCGCGGOG CGGGCTGGGG GCCCGGGCAG CCGCGCTCGG GCAGCGGGGG OGCGGGGCTG 
CCGCCTGCGC TCGCAGCTGG TGCOGGTGCG CGCGCTCGGC CTGGGCCACC GCTCCGACOA 

GCTGGTGCGT TTCCGCTTCT GCAGOGGCTC CTGCCGCCGC GCGCGCTCTC CACAOGACCT
CAGCCTGGCC AGCCTACTGG GCGCCGGGGC CCTGCGACCQ CCCCCGGGCT CCCGQCCOSt 
CAGCCAGCCC TGCTGCCGAC CCACGCGCTA OGAAGCGGTC TCCTTCATGG ACGTCAACAG 
CACCTGGAGA ACCGTGGACC
CGCTCCAGGG CTTTGCAGAC 
GCAGAGTCXX; ACTAGCOUSC 
CTACCGGTGG GTCATGOATA 
CAGAGCCCTC ACCCTGCGGA 
CTTCGOACCC ACTTCTCACA 
TCTGATGAAC ACTACAGTC3G 
TTTGAAGGAC ACATATTGCA 



GCCTCTCCGC 
TGGACCCTTA 
GGCCTCAGCC 
TCATCCCCX3A 
TCCCAGCCTA 
GACTCTGGCA 
CTGAGOCATC 
GTTGCTTGGT 



CACCGCCTGC 
CCGGTGGCTC 
AGGGACGAAG 
ACAGGTQAAG 
AAAGACACCA 
CTGGCCAGGC 



GGCTGCCTGG 
TTCCTGCCTG 
GCCTCAAAGC 
GGACAACTGA 
GAGACCTCAG 
CTCGAACCIQ 
CAGGCCCTGT 
TGTQCTGGAA 



GCTGAGGGCT 720 

GGACCCTCCC 780

TGAGAGGCCC 840

CTAGCAGCCC 900 

CTATGGAGCC 960 

GGRCCCCTCC 1020 

AGGGACAQCA 1080 

CTGGCCTGTA 1140 






21 



31 



41 



51



I 



111 
MEIiGIiGGLST LSHCPHPRRQ APIiQIiSAQPA IMPTUAIAIi I.SSVAEASU3 SAFRSSASKB 
GPPPVLASPA GHLPGGRTAR WCSGRAHRPP PQF8RPAPPP PAPPSAI.PRG ORAAHAGGPQ 
SSABAAGMIG CRLRSQLVPV RAIiGUSBRSD BUVWeXPCSB BOtRARSPHD LSItASW»U3 
AUIPPPCSRP VSQPCCRPTH YERVSFMDVM STWRTVDRLS ATACGCLG 



21 






CTACTGCRCC 
ATAGCAATTT 
CCGTGAACCT 
AGGTAACX3TT 
GATCTAATCA 
GCTAAAAATG 
TATTCTTCAT 
aVCTCTTAAG 
GIVATCtCMSQ 



CCTTTGGTTT 
TATGAATGCT 
TTCTTTTGCT 
ATGATCTAGT 
AACTTGAAAC 
CCTGGCTGTA 
GATTTTATGT 



I 

" TTCCTTTGGA 
TTAAGACTTC 
GCTTAAAAAT 
CTCATCTTGG 
CAGAAGCTAC 
ACGGAAGTA6 
ATAATAGCCA 
GTATTArrrCA 
TTATCACTTT 



31 
I 

RATCTCTTAC 
TACATTGCTT 
AATGTCAAAA 
TTTCCATATA 
TTCACTGGCT 
TGGTTGGTCC 
TTATTTGTTA 
AATTGCTATT 



CTTTCATTAG C 
TTTCmTAT 

TATGTrTTAG 
CTATTTTTGa 
AACAGTGATC 
AGTTTGAAAG 
TGCCTTTGTT 
ACTQTTCTTT 



TATCTGTGCT 
CTGCCTACTC 
TTTTTTGTGA 
ATGTTCATGT 
CTCTTATTAG 
ATGTAGCAOA 
TTATAGTTGA 



Seq ID NO: 283 DNA sequence
Nucleic Acid Accession #: Bo
Coding sequence: 564-1481



Z OCTTGTGCTT 



ACITG6ATC3V 
GGACOGATGC 
CTTTCCCTGA 
GCAAACAAGG 
GGGGAGACAC 
AAGGGCTGQC 
CCTCCCQGCC 



AGTTCCCTCC 
TCACAATTCT 
TTCAGTGGCA 
GACAGGCCCT 
CGTACTAAAC 



AATGACTGTC 
CCTCTCCTCA 
GACCTGGTAA 
GGTAACATAT 
CAAAGTTGTC 
AAGCTTGCAA 



AAATTAGGTC 
CTAAAAGTGC 
AAATATATCX3 
TTATATAGGG 
TTCATGTACA 



TCCAOQOGTC GCCATGGCAA C 



ACTTGGOGGG 
CCGGTTTCTC 
CTATGACGGG 
OSCTGAGCTG 
AGTTTGACGA 
TGGATGAAQA 
AGGGGTGOSC 
GCTTACCAGA 
AACX3TGACCG 
AAGAAATGGA 
TTCAGAAAAA 
AGGAAAAAGC 
ATCAAGAATG 
AAAACAACAG 
ATGGTTGGAA 



CCAOATQTGG 
TGCTGACTGC 
CGChCGTGGG 
CAGCAATTCC 
GGACGACGGT 
TGCCCATGAT 



GCGGOGGGGC GCTGGGGGCC 
AGACCCAGGT 
CAGGGGCTGG 
ACCAGGTCGC 



AGGTGGTGCG 



GTTCTGGGAT ACAOCTaTAA 180

AAAGAAATCA 240 

GTTTCTGCX3T 300 

CRACACCACG 360

CCAGTGGCGT- 420 

ACTGAGGAGG 480 

CTGACAAAGT 540

GGATTGGCCG 600 

GCCAGTTTOA 660 

TCTTCCTACX3 720 

TGCTCAGTCC 780 

COGOCGCTGC 840 



AAATGAACTG 
CAGGACCCCG 
CCTTCCTGCC 
TCTGCaGGCC 
GTGCTGGCCQ 
CCGCGGCTCC 
TACTTTTCCC 
CTCGGACTCC 
CTCGCCGTCG 
CCTTGGOCAC 



AATTCOQTGG 
GAAGAGTAAA 
AGCCAGGAGC 
AATAACAIGC 
CAGTCAATAA 



AAGCOGCCTG 
GCTGCAACTG 
AGAACGTGAA 
GAATGAGCAA 
AGCAAAGGAA 
GTTAAAGAAA 
CAAGCTGAAA 
AATGCGAAAC 
ACAGGTTTTT 
AAACCRATTC 
AGACCTGTGA 
AATCTTTGCC 
TTTTATCTGQ 



TCAOAGQCCA 
GTTGAATCAG 
ACACCATGGG 



AAAASAAAGA 
AAAAGAAAAO 
CTGGAGAAAG 
AAAAATQCIQ 



AGGTGTGGTT 
AGGAATTAAA 
TAATTGCTGA 
AAAGAQAACA 
AATACTTGCA 



CCrOftGAOGA 
AGAAGAACAG 
TATTGGCAAA 
TCAACAACTA 
AGAAAAGCAC 



ATGGRSTTAC 1020 

AAACAGGTGC 1080 

GAAAAAGAAG 1140 

GAAAAAAGAA 1200 



AGAAAAAGCA AAAGAAAAAT 1380 



ATAAACCTOO 
ACAGTG6AAA 
ATATGCXavCC 
TAAGTC3U3CC 
TTGGAACTCT 



GTTGATCTTG 



AGAGTTGAOQ 
GCATGTTGTT 
TATTAAAAOT 



AATGTGATTA 
AAGATATTTT 
TTSCAGAATA 
AC&CTQTAAT 



TTOCTATCCA 
TCCXAAAGAA 
ACACAAGTCA 
GTGCAGAATA 
TTTAAAAATC 
TTGACAAATA 
CTGQATTTTG 
AGTOGCTQAA 



AAOAGCTATG 
GAACCaGCCT 
GCTAAGGATC 
TCATCTCTGG 
CAAAGATAGC 



GCAATTTTTG 
GITTTTATAA 
TAT6TAAGAA 
CTAACAATTT 



AAAGAAAAOA 
AGTrrCAAOA 
QTTATBCCAA 
TTTATAATOC 
TATCAGGAAG 
TAATTCATAA 
GTATGTGGAA 
TTTTACTGCT 
CArrTOTATA 



11 



21 



31 



41 



I I I I 
MATRQUaWQ UU3IARAGPA GKRHPHRaSA SLNIAGQMKA AGHWGPTFPS 
PRSRPSSDSC SVPMTGARGQ GLEWRSPSP PLPLSCSNST RSUjSPLGHQ 
GEDEEOVDDB BOVDEDABDS EAKVASUtGH ESENNQBEQK 
TC RKIIAEEXHK 



1560
1620 
1680 
1740 

1860 
1920 
1980 



K EHBEXAAKEIi BKBYI-QEKMC EKyQBWUCXX HAEECBRKKK EKiOIHSKLKy 
RRKKK

Seq ID NO: 285 DNA sequence 
Nucleic Acid Accession #: Eos sequence 
Coding sequence: 1-1746

1 11 21 31 41 51 

ATGCXaCTGA AGCATTATCT CCTTTTGCrG GTGGGCTGCX: AAGCXTTOGGG TGCftGGGTTG 60 

GCCTACCATG GCTGCCCTAG CGAGTGTACC TGCTCCaMSGG CCTCCCAGGT GGAOT6CAOC 120

GGGGCACGCA TTGTGGCXiGT GCCCACCCCT CTGCCCTGQA AOGCC^TaAB CCTGCAGATC 180 

CTCAACACGC ACATCACTGA ACTCAATGAG TCCCCGTTCC TCAATATCTC AGCCCTCATC 240 

GCCCKSAGGA TTGAGAAGAA TGAGCTGTCG CGCATCACGC CrGGGGCCTT CCGAAACCTG 300 

GGCTCGCTGC GCTATCTCAQ CCTOSCCAAC AACAAGCTGC AOGTrCTGCC CATCGGCCTC 360 

TTCCaGGGCX: TGGACAGCCT TGAGTCTCTC CTTCTGTCC3V GTAACCAGCT GTTGCAGATC 420

CAGCOGQCXX: ACTTCTCCCA GTGCAGCAAC CTCAAGGAGC TGCRGTTGCA CGGCRACCAC 480 

CTGGAATACa TCCCTGACGG AGCCTTCGAC CACCTGGTAG GACTCACOAA GCTCAATCTG 540 

GGCAAGAATA GCCTC»CCCA CATCTCACCC AGGGTCTTCC AGCACC TGGG CAATCTCCAG 600 

GTCCTCCGGC TGTATGAGAA CAGGCTCAOS GATATCCCX3i TGGGCACTTT TGATGGGCTT 660 

GTTAACCTGC AGGRACTGGC TCTACAGCAG AACCAGATTG GACTGCTCTC COCTGSTCTC 720

TTCCACAACA ACCACAACCT CCAGAGACTC TACCTGTCX» ACAACCAOVT CTCCCAGCTG 780 

CCACCCaGCA TCTTCATGCA GCTGCCCCAO CTCavACCGTC TTACTCTCTT TGGGAATTCC 840 

CTGAAGGABC TCTCTCTGGG GATCTTOGGG CCCATGCCCA ACCTGaX3GA GCTTTGQCTC 900 

TATOACRACC ACATCTCTTC TCTACCCGAC AATGTCTTCA GCAACCTCCG CXaGTTGCAG 960 

(JTCCTGATTC TTAGCCGCAA TCAGATCAGC TTCATCTCCC CGGGTGCCTT CAACGGGCTA 1020

AOSGAGCTTC GGGAGCTGTC CCTCCACACC AACGCACTGC AGGACCTGGA CGGGAATGTC 1080 

TTCCGCRTGT TGGCCAACCT GCAQAACATC TCCCTGCAGA AC3UVTCGCCT CAGACAGCTC 1140

CCAGGGAATA TCTTCGCCAA CGTCAATGGC CTCATGGCCA TCCAGCTGCA GAACAACCAG 1200 

CTQQAGAACT TGCCCCTCGG CATCTTCGAT CACCTGGGGA AACTGTGTGA GCTGCGQCTS 1260 

TATGACAATC CCTGGAGGTG TGACTCAGAC ATCCTTCCXSC TCCGCAACTG GCTCCTQCTC 1320

AACCAGCCTA GGTTAGGGAC GGACACTGTA CCTGTGTGTT TCAGOXAfiC CAATGTCOGR 1380 

GGCCAGTCCC TCATTATCAT CAATQTCAAC GTTGCTGTTC C3U«KX3TCX3V TGTCCCTGAG 1440 

GTGCCTAGTT ACCCAGAAAC ACCATGGTAC CCAGACACAC CCaGTTACCC TGACACCACA 1500 

TCCGTCTCTT CTACCACTGA GCTAACCAGC CCTGTGGAAG ACTACACTGA TCTGACTACC 1560

ATTCAGGTCA CTGATGACCQ CAGCX3TTTGQ GGCATGACCC AGGCCX3«3AG CGGGCTGGCC 1620

ATTGCCGCCA TTGTAATTGQ CATTGTOGCC CTGGCCTGCT CXCTGGCTGC CTGCGTCGGC 1680 

TGTTGCTGCr GCAAQAAQAQ QAGCXAAGCT GTCCTGATGC AGATGAAGGC ACCCAATGAG 1740 

TGTTAAAGAG GCAGGCTGGA GCAGOGCTGa GQAATQATCO aACTGGW3GA CCTGGGftA Tr 1800 

TCATCTTTCT GCCTCCACCC CTGGGTCCAT GGAGCTTTCC OGTQATTGCT CTTTCTGGCC 1860 

CTAGATAAAO GTOTOCCTAC CTCTTCCTGA CTTGCCTGAT TCTCCCGTAG AGAAGCAGGT 1920

CGTGCCGQAC CTTCCTACAA TCAGGAAGAT AGATCXUUVCT GGCX»TGGCA AAAGCCCTGG 1980 

GGATTTCCGA TTCATACCCX: TGGGCTTCCT TCGAGAGGGC TCTTCCTCCA AATCCTCCCC 2040 

ACCTGTCCTC CAAGAACaGC CTTCCCTGCG CCCAGGCCCC CTCCGGGCCT CTGTAGACTC 2100 

AGTTAGTCCA CAGCCTGCTC ACTTCXJTGGG AATAGTTCTC CGCTGAGATA GCCCCTCTCX3 2160 

CCTAAGTATT ATGTAAGTTG ATTTCCCTTC TTTTGTTTCT CTTOTTTGrG CTATGGCTTG 2220

ACCCAGCATG TCCCCICAAA TGAAAGTTCT CCCCTTGATT TTCTGCTCCT GAAGGCAGGG 2280 

TGAGTTCTCT CCTCAAAGAA GACTTCAAAC CaTTTAACTG GTTTCTTAAG AGCCQTCAAT 2340 

CAGCCTGGTT TTGGGGAT6C TATGAAAGAG AOAAGSAAAA TCRTGCCGCT CAGTTCCTGG 2400 

AGACAGAAGA GCCGTCATCA GTGTCTCACr TaTGATTTTT ATCTGGAAAA GGAAGAAACA 2460 

CCCXXGCACA GCAAGCTCAG CCTTTTAGAG AAGGATATTT CCAAACTGCA AACTTTGCTT 2520

TGAAAAGTTT AGCCCTTTAA GGAATGAAAT CATGTAGAAT TTTGGACTTC TAAAAACATT 2580 

AAAATCAGCT TATTAATACX3 GGATAGAGAA AGAAATCTGG TGCCTGGGGG TCCCTGTCTT 2640 

CACCCCTAGA GTTTGTTTTA AAATTTTTAA TTGAAGCATG TGAAGTGTAC STGCAGAAAA 2700 

aTGOGAACAT GATAGTGTAT GGCTTGGTGG ATTTTCACAA ACTGAACATA CCTGTGTAAT 2760 

CAGCATCTAG ACCCAGACCC AGAGCATCAC AAATATCCCC CATCCTGGGC TTTTCCCAGA 2820

GOAOAIGGGG GCTTCrGRAB RXGGACTTAC CTGGSACCTG CCCCCCATGA GCCAGGACGG 2880 

TCCCCCCAOX GTCAGCXrror OCAAAGGCCC CXSTGGCCaGG GGTGGAGGAG AATATGTGGG 2940 

TGTQGACAGG ATGGGAGACT GTG6CCT6AA CAGGAGRTTT TATTATATCT GGAGACCCTG 3000 

AGAQACCCTG AGACCTGGGG CACX3VTGGCT GGCCAGGTCA GAAGCATCCT GACTGCAGAG 3060 

GTCCGTGCAQ CCACACCCTC TTCCCTGCCA GCAAGTTQTC TGCXK3CTCAT CGGAQGCCXJC 3120

TCCGCCTCGA GCCTTCTATG GAC3GTGATAT GCCTGTATCT GTTTTTAATT TTCATTCTTC 3180 

ACTTAGGGGA AGTGAAATCG CTCaGAGATG AGATCCTTTA ATTGAAAACG AAGTQTAACG 3240 

GAATCTAGTG TCTTTCTAAT GTOGTAAAAT TCTCCATCAA CATCAC3M3TC AGCTGGCAGC 3300 

TQAACTTCAO AATCTCACTT ACAGCAQGCQ ACACGGGGGT ACACCX3ATGG GTCA CACTG G 3360 

GTCTOGGGGC TOCXnXKJAGC TCCTCCTOCM TGTGGTCTGG TTAGGAGTTG AGTTGTTTGC 3420

TCCAGGGTrA TTCTOCTCCT CGAGTCAC»G TCACAOGAAT ACCTGCCTTC TCTGGCTTTC 3480 

CTGCTATACA C3VTATTCACA TGGCGCTCAR GAAGTTAGGC TCATGGCAAC GTGTGTCTTT 3540

CTCTGGACAA CTGGCCCAGT TTACAGTGAA ATGGAGAATT TCAGGTCTCC ACGTCTGCCC 3600 

AGQAAAGAAC TTCAGCTGAC TCCACGGGGA TCTGGAAATC CACGACCAAT CCCOATCGGC 3660 

TCTTATTAQC TCCCCGCTCC ACARGACACC TGTGCTTTGG AAATOCACCA CCAATCCGGA 3720

TCGGCTCTTA TTAGCTCCCC GCTCCACAAG ACACCTGTGA TCTGGAAATC TACCRCCAAT 3780 

CCCOATCGGC TCTTATTAGC TCCCCGCTCC ACAAGACACC TGTGACATCC TCCAG6GCCA 3840 

CAGOAGCACQ TGCTGACCAG TTTTCCCTTC CAGTTCCTGC ACAAAAAGTG TCCAGAGGGC 3900 

TOTTTGCAAA CACTASTCCA CTTTGTAGCT TTTCACCCTC TGTCCCAGGG AATCTAGGAG 3960 

AGATQAGGCC CGTCAGAOTC AASAGATGTC ATCCCCCCAG GGTCTCCAAG GCATTTCCAC 4020

ACTATTGGTG GCACCTGGAG QACATGCACC AAGGCTTGCC AGAGCCAACA GGAAGTGAGC 4080 

CCAOAGCATG GCACATGAGC ATCACCCGCT GATGGTGQCC TGCTGTGCCT OGTGCCAACA 4140 

GGQGCATCCC GGCCCGTACC CCTCCAGACA QGAAGCATGQ GTTTGCCCAC AGACCTOTCQ 4200 

GGTGCTCCTG TGAGTGGCCT CCAGATGTCT TTGTGCATAG GCACAAQTGG OCCAG GGCTg 4260 

GAGGGAGCTG GGAAACCTCA TCATCCGGTQ GGCCCTGCCA ATCTTAACCC ASAACCCTTA 4320

GOTATTCCTG GCAGTAGCCA TGACATTGGA GCACCTTCCT CTCCAGCCAG AGGCTGACCT 4380 

GAGSGCCACT GTCCTCAGAT GACACCACCC AGQAGCACCC TAGGTGAGGG GTGAGGGCCC 4440 

CCTTATQIGR ACCTCTTGCC TCTTCCTTTC TCCCATCAGA GTGGTTGGAT GGAGCCATTG 4500 

GCCTCCTTTT CTTCAGCGGG CCCTTCAACC TCTCTGCACC ATGTTGTCTG GCTGAGGAGC 4560 

TACTAGAAAA GCTGASIGGA GTCTCCTTTC CAACAGGATG ATGCATTTGC TCAATTCTCA 4620

GGGCIGOAAT GAGCCGGCTO GTCCCCCAGA AAGCTGGAGT GGGGTACMSA GTTCAGTTTT 4680 

CCTCTCTGTT TACAGCTCCT TGACM5TCCC ACGCCCATCT GGItfJTaGGAQ CTGGQAGTTA 4740 
GTOTTGOAGA AGAAACAAC3V AAAGCCAATT AGMVCCACTA TTTTTAAAAA GTGCTTACm3 4800

TGCACAGA.TA CTCTTCAAGC ACTGaAGGTG QATTCTCTCT CTAGCCCTCA GCACCCXTTGC 4860 

GGTAOGAOTG COGCCTCTAC CCACTTGTGA TGGGGTACAG AGGCACTTGC TCTTCTGCAT 4920 

GGTGTTCa^T AGGCTQGGAO rTTTATTTAT CTCTTCAAAC TTTGTACAAG AGCTCATGGC 4980 

TTGTCTTGGa CTTTCQTCAT TAAACXAAAG GAAATGGAAG CCATTCCCCT GTTGCTCTCC 5040 

TTAGTCTTGG TCATCAGAAC CTCACTTGGT ACCATATAGA TCAAAAGCTT TGTAACCACA 5100 

GOAAAAAATA AACTCTTCCA TCCCTTAAAG AATAGAATAG TTTGTCCCTC TCATGGGAAT 5160 

TGGGCTGTAT GTATATTGTT CTTCCTCCTT AGAATTTAGA GATACAAGAG TTCTACTTAG 5220 

AACTTTTCAT GGACACAATT TCCACAACXTT TTCAGATGCT GATGTAGAGC TATTGGGAAA 5280

GAACTTCCAA ACTCAGGAAO TTTGCAGAGA GCAGACAGCT AGAGATAACT CGGGACCCAG 5340 

AQTTGGTCGA CAGATGTTAO ATGTATCCTA OCTTTTAGCC ATAAACCACT CAAAGATTCA 5400 

GCCCCCAOAT CCCKOVCfTCa GAACTOAATC TOOOTTOTTG GQAAGCCAGC A6TGGCCTTO 5460 

QGAAGGAAGC CATGGCTGTG GTTCAGAGAG GGTGGGCTGG CAAGCCACTT COGGGGAAAA 5520

CTCCTTCCGC CCCAGGTTTC TTCTTCTCTT AAGGAGAGAT TGTTCTCACC AACCCGCTGC 5580 

CTTCATGCTG CCTTCAAAGC TAGATCATGT TTGCCTTGCT TAGAGAATTA CTGC3kAATCA 5640 

GCICCCAGTGC TTGGCGATGC ATTTACAGAT TTCTAGGCCC TCACGGTTTT GTAGAGTGTQ 5700

AGCCCTGGTG GGCAGGGTTG GGGGGTCTGT CTTCrGCTGG ATOCTGCTTa TAATCCATTT 5760 
GGTGTACAGA ATCAACAATA AATAATATAC ATGTAT 

Seq ID NO: 286 Protein sequence:
Protein Accession #: NP_570843.1



MPLKHYLLUi VGCQAWGAGI. 
LNTHITELNB SPFUIISAIjI 
FOOIiDSLESL LLSSNQLLQI 
OKNSLTKISP RVFQHUaiU} 
FBMNHNUIRI. YIiSNNKISQL 
YDNHISSLPD NVFSNLRQLQ 
FHMLAHLQNI SLQNIIRUtQIi 
raifPWRCDSD ILPUmWLU. 
VPSYPETPWY PDTPSYPDTT 
lAAIVIGIVA LACSLAACVG 

Seq ID NO: 287 DNA sequence
Nucleic Acid Accession #: NM_002362
Coding sequence: 1..954 

11 21 31 41 51 

I I I I I 

AGCAGAAGAG TCAGCACTGC AAGCCTGAGG AAGOCGTTGA GGCCCAAOAA 60 

GCCTGGTGGG TGCACAGGCT CCTACTACTG AGGAQCAGGA GGCrGCTOrC 120 

CTCCTCTGGT CCCTGGCACC CTGGAGGAAG TGCCTGCTQC TOAGTCAQCA 180 

AGAGTCCTCA GGGAGCCTCT GCCTTACCC3V CTACCATCftG CTTCACTTGC 240 

CCAATGAGGG TTCCAGCAGC CAAGAAGAGG AGGGGCCAAG CACCTOGCCT 300 

CCTTGTTCCG AGAAGCACTC AGTAACAAGG TGGATQAGTT GGCTCATTTT 360 

AGTATCXJAGC CAAGGAGCTG GTCACAAAGG CSWSAAATGCT GGAGAGAGTC 420 

ACAAGCGCTG CTTTCCTGTG ATCTTCX3GCA AAGCCTCCX3A GTCCCTQAAG 480 

GCATTGACGT GAAGGAAGTG GACCCCGCCA GCAACACCTA CACCCTTGTC 540 

GCCTTTCCTA TGAIGGCCTG CTGGGTAATA ATOUSRTCTT TCCCAAOACA 600 

TAATCGTCCT 6GGCACAATT QCaATGQAGG GCGACAGCGC CTCTGAGGAG 660 

AGGAGCTGGG TGTK2AT6GGG GTOTATGATO GGAGGGAGCA CACTOTCTAT 720 

GGAAACTGCT CACCCAAOAT IGGGIGCAGG AAAACTACCT GGAOTACOGG 780 

GCAOTAATCC TGOQOGCTAT GAQTTCCTOT QGOaTCCAAG GGCTCTOGCT 840 

ATGTGAAAST CCTGGAGCAT GIGGTCAGGG TCAATCCAAG AGTTOGCATT 900 
CCCTGCGTGA AGCAGCTTTa TTAGAGQAGG AAGAGGGAGT CIGR 



Seq ID NO: 288 Protein sequence:
Protein Accession #: NP_002353.1

1 11 21 

I I I 

MSSEQKSQHC KPEEGVEAQE EAIjGLVGAQA 
GPPQSPQGAS ALPTTISFTC WRQPNBGSSS 

LLRKYRAKEIi vtkaemiuerv iknykrcfpv 

TC2/3I.SYDGI. LGNNQIFPKT GIiIiIlVUGTI 
GEPRKIiI.TQD WV(3ENVl4EYR QVPGSNPARY 
AYPSIiREAAL LEEEEGV 

Seq ID NO: 289 DNA sequence
Nucleic Acid Accession #: NM_002362
Coding sequence: 46..1344



11 21 31 41 51 

I I I I I 

COQCGaCCGC GCCCTOBTTG GGTCCCCaiCT GCTCTCGGGG GGQCCAIGGA OOAGGCOGTO 60 

GOOSACCTGA A6CAGGCGCT TCCCT6TGTG GCCQAGTCGC CAAOGGTCCA CQTGGAGOTG 120 

CATCAGCGCG GCAGCAGCAC TGCAAAGRAA GAAGACATAA ACCTGAQTOT TAGAAAGCTA 180 

CTCAACAGAC ATAATATTOT GTrCGGTOAT TACACATGGA CTGAGTTTGA TGAACCTTTT 240 

TTCACCXGAA ATGTGCAGTC TGTGTCTATT ATTGACACAG AATTAAAGGT TAAAOACTCA 300 

CAGCCCATCG ATTTGAGTGC ATGCACTGTT GCACTTCACA TTTTCCAGCT GAATGAAGAT 360 

GGCCCCAGCA GTGAAAATCT GQAGGAAGAG ACAGAAAACA TAATTGCAGC AAATCACTGG 420 

GTTCTACCTG CAGCTGAATT CCATGGGCTT TGGGACAGCT TGGTATACGA TGTGGAAGTC 480 

AAATCCCATC TCCTCGATTA TGTGATGACA ACTTTACTGT TTTCAGACAA GAACOTCAAC 540 



21 31 41 51 

I I I I 

AYHGCPSECT CSRASQVECT GARIVAVPTP LPWNAMSLQI 60

AliRIEKNELS RITPGAPRNb GSUlYIiSUVM NXLQVLPIGL 120 

QPAHFSQCSH LKELQLHGNH IiEYIPDGAFD HXiVGLTKLNL 180 

VUUiYENRIiT DIP^faTFDGL VHIiQEIiAIiQQ MQIGLLSFGIi 240 

PPSIPMOLPQ UIHI.TLEGHS UCBLSI.QIFO SKSSLttZlML 300 

VXiIbSRHQIS FISPOAPHGL TELRBLSMT HAIKJDLDGNV 360 

FGHIFANVMG liMAIQLCiNNQ I.ENI.PU3IFD HIiGKLCELRL 420 

HQPRLGIDTV PVCFSPAHVR GQSLIIINVN VAVPSVHVPE 480 

SVSSTTELTS PVEDYTDLTT IQVTDDRSVH GHTQAQSGLA 540 
CCCCKKRSQA VLMC^tKAPllE C 



GGTCCTCCCC 
TGGAGGCAAC 
GACGCAGAGT 
CTGCTCCGCA 
ATCAAAAATT 
ATGATCTTTG 



CAGGTACCCQ 
GAAACCAGCT 
GCCTACCCAT 



PTTEEQEAAV SSSSPI.VPGT LEEVPAAESA 60 

QEBEGPSTSP DAESLFRSAIi SNKVDEliAHP 120 

IFGKASESI.K MIFGIDVKBV DPAS1}TYTI.V 180

AMEGDSASEB BIWEBLGVMG VYDGREHTVY 240 

EFLMGPRAIiA KT3YVKVLEH WRVNARVRI 300 
AGCAACCTCA TCACCTGGAA CCGGaTGOTG CTGCTCCAOS GTCCTCCTGG CACTGQIVAAA 600

ACATCXXTTOT GTAAAGa3TT AGCCCAGAAA TTGACAATTA GACTTTCAAG CAGQTACC6A 660

TATGGCCAAT TAATTC3AAAT AAACAGCCAC AGCCTCTTTT CTMGTGQTT TTOGGAAAGT 720 

GGCAAGCTGG TAACCAAGAT GTTTCAGAAG ATTCAGGATT TOATTGATGA TAAAGACGCC 780 

CTGGTGTTOG TGCTGATTGA TGAGGTGGAQ AGTCTCACAG CCCCOCXSAAA TGCCTGCAGG 840 

aasGoauxXi agccatcaga tgccatccgc gtggtcaatg ctgtcttgac ccaaattgat 900 

CAGATTAAAA GGCATTCCAA TGTTGTGATT CTGACCACTT CTAACATCAC CGAGAAC5ATC 960 

GACGTGGCXrr TCGTGGACAG GGCTGACATC AAGCAGTACa TTGGGCCACC CTCTGCAaCA 1020 

GCCATCTTCA AAATCTACCT CTCTTGTTTG GAAGAACTGA TQAAflTQTC3V GATCATATAC 1080 

CCTCGCXavGC AGCTGCTGAC CCTCCGAGAG CTAOAGATGA TTGGCTTCAT TOAAAACAAC 1140 

GTGTCAAAAT TGAGCCTTCT TTTGAATGAC ATTTCAAGQA AGAGGGAGGG CCTCftOCQGC 1200 

CGGGTCCTGA GAAAACTCCC CTTTCTGGCT CATGOGCTGT ATGTCC3U3GC CXXXACCGTC 1260 

ACX^TAGAGG GGTTCCTCCA GGCCCTGTCT CTGGCAGTGO ACAAGCAGTT TOAAQAGAQA 1320 

AAGAAGCTTG CAGCTTACAT CTGATCCTGG GCTTCCCCAT CTGGTGCTTT TCCCATGGAG 1380 

AACACACAAC CAGTAAQTQA GGTTGCCCCA CACAGCCGTC TCCCAGGGAA TCCCtTCTGC 1440 

AAACCAAA06 TTACTTAGAC TGCAAGCTAG AAAGCCRCCA AGGCCAGGCT TTGTTAAAAG 1500

AAOTOTATTC TATTTATGTT GTTTTAAAAT GCATACTQAG AGACAAACAT CTTGTCATTT 1560

TCACTGTTTO TAAAAGATAA TTCAGATTQT TTGTCTCCTT GTGAAQAACX: ATCGAAACCT 1620 

GTTTGTTCCC AGCXCACCCC CAOTGGATGG GATGCATAAT GCCAGCAAGT TTTGTTTAAC 1680 

AQCAAAAAAG g3u«3ATTAAT GCAGGTOTTA TAQAAOCCAG AAGAGAAACT GTSTCACCCT 1740 

AAAGAAGCAT ATAATCATAG CATTAAAAAT GCACACATTA CTCCAS GTOg AAOGTGGCAA 1800 

TTGCTTTCTQ ATATCXGCTC GTTTGATTTA GTGCAAAAAT GTTTTCAAQA CTATTTAATG 1860 

GATOTAAAAA AGCCTATTTC TACATTATAC CAACTGAGAA AAAAATGGTC GGTAAAGTGT 1920 

TCTTTCATAA TAAATAATCA AGACATGOTC CCATTTGCAO GAAAAGTGCA GACTCTOAGT 1980 

GTTCCAGGGA AACACATGCT GGACATCCCT TGTAACCCX3G TATGGGCGCC COTGCATTGC 2040 

TGGGATGTTT CTGCKCACGG TTTTGTTTCT GCAATAACXTT TATCACATTT CTAATG AGGA 2100 

TTCACATTAA TATAATATAA AATAAATAGG TCAGTTACTQ GTCTCTTTCT GCCGAATGTT 2160 
ATOTTTTGCT TTTATCTCAC AQTAAAATAA ATAIAATTAA AAA 



1 11 21 31 41 51

I I I I I I

MDEAVGDLKQ ALPCVAESPT VHVKVHQRGS STAKKEDIKL SVRKLUJRHN IVFGDYTWTE 
PDEPPLTRNV QSVSIIDTEI. KVKDSQPIDI. SACTVAMIP QLNEDQPSSE NLEEETENII 
AAHHWVLPAA EFHGLMDSLV YDVEVKSHIiL DYVMTTLLFS OKSWSNIiIT WNRWLliHGP 
PaTaXXSLCK ALAQKLTIRI. SSRYRYGQI.I EIHSHSLFSX WFSESGKLVT KMFQKIQDH 
DDKDALVFVI. 1DEVESI.TAA DNACKAGTGP SDAIHWWAV IiTQIOQIKRH SNWILTTSN 
ITBKIDVAFV DRAOIKOYIQ PPSAAAIFKI YbSCIiEELMK GQIiyPROQL LTLRELBMIG 
FIENNVSKLS LLLNDISRKS EGLSQRVLRK LPFLAHALYV QAPTVTIEGP I«AI.SIAVDK 
QFEERKKIAA YI 



Seq ID NO: 291 DNA sequence
Nucleic Acid Accession #: NM_
Coding sequence: 77-1372



1 11 21 31 41 51

GTCCCCGCAG CGCCGTCGCG CCCTCCTGCC GCAGGCCACC GAGGCCGCCG CCGTCTAGC6 60 

CCXXKACXrrC GCCACXZATGA GAGCCCTGCT GGCGCGCCTG CTTCTCTGCG TCCTGGTOOT 120 

GAGCGACTCC AAAGGCAGCA ATGAACTTCA TCAAGTTCCA TCGAACTGTG ACTGTCTAAA 180 

TGGAGGAACA TGTGTGTCCA ACAA6TACTT CTCCAACATT CACTGQTGCa ACTQCCXaiAA 240 

GAAATTOGGA GGGCAGCACT GTGAAATAGA TAAGTCAAAA ACCTGCTATG AGOGSAATGQ 300 

TCACTTTTAC OGAGGAAAGG CCRGCACTGA CACCATGGGC CGGCCCTGCC TGCCCTGGAA 360 

CTCTGCCACT GTCCTTCAGC AAAOQTACCA TGCXXAC3W3A TCTGATGCTC TTCAGCTGGG 420 

CCKSGGGAAA CATAATTACT GCAGGAACXX: AOACAACXSM AGGOSACCCT GGTGCTATOT 480 

QCAGGTGGaC CTAAAGCCXJC TTGTCCAAGA GTGC3VTGGTQ CATQACTGCG CAOATGOAAA 540 

AAAGCCCTCC TCTCCTCCAG AAGAATTAAA ATTTCAGTQT GGCCaAAAGA CTCTOAGGCC 600 

CCGCTTTAAG ATTATTGGGG GAGAATTCAC CACCATOGAG AACCAOCCCT GGTTTGOGfOC 660 

CATCTACAGG AGGCACCGGQ GGGGCTCTGT CACCTAOSTG TGTGGAGGCA GCCTCATCXG 720 

CCCTTCCTGG GTGATCAGCG CCACACACTG CTTCATTGAT TACCXAAAOA AGGAQOACTA 780 

CATCGTCTAC CTGGGTCGCT CAAGGCTTAA CTCCAACACX3 CAAGGGGAGA TGAAGTTTGA 840 

GGTGGAAAAC CTCATCCTAC ACAAGGACTA CAQCQCTQAC ACGCTTGCTC ACCa^CAACGA 900 

CATTGCX:TTG CIOAAGATCC GTTCCAAGGA GGGC3M3GTGT GCGCAGCC31T CCCX3QACTAT 960 

ACAOACCATC TCCCTCCCCT CGATQTATAA CGATCCCCAG TTTGGCACAA GCTGTGAQAT 1020 

CACTCGCTTT GGAAAAGAGA ATTCTACCGA CTATCTCTAT CCGGAGCaGC TGAAAATGAC 1080 

TGTTOTGAAG CTGATTTCCC ACCX3GGAGTG TCAGCAGCKC CACTACTACX3 GCXCTOAAGT 1140 

CACCACCAAA ATGCTATGTG CTGCTQACCX: CCAAXGGAAA ACACATTCCT GCCAGGOAQA 1200 

CTCAGGGGQA CCCCTCGTCT GTTCCCTCCA AGGOCGCATO ACTTTGACTG GAATTGTGAa 1260 

CTGGGGCCGT GGATGTGCCC TGAAGGACAA GCCAGGOOTC TAOOGAGAO TCTCACACTT 1320 

CTTACCCTGG ATCCGCAGTC ACACCAAGGA AGAGAATGGC CTGGCCCTCT GAGGGTCCCC 1380 

AGGGAGGAAA CGGGCACCAC CCGCTTTCTT GCTGGTTGTC ATTTTTGCAO TAGAGTCATC 1440 

TCCATCAGCT GTAAGAAGAG ACTGGGAAGA TAGGCTCTGC ACAGATGGAT TTGCrTGTGG 1500

CACCACCAGG GTGAACGACA ATAGCTTTAC CCTCACGGAT AGGCCTGGGT GCTGGCTGCC 1560 

CAGACCCTCT GGCCAGGATG GAGGGGTGGT CCTGACTCAA CATOTTACTG ACCAGCAACT 1620 

TGTCTTTTTC TGGACTCAAG CCTGCAGGAG TTAAAAAGGG CAGGGCATCT CXTTOTGCATG 1680 

GGCTCGAAGG GAGAGCCAGC TCCCCCGACC GGTGGGCaTT TGTGAGGCCC ATGGTTQAGA 1740 

AATGAATAAT TTCCCAATTA GOAAGTGTAA GCMCTQWSG TCTCTT6AS0 GAGCTTAGOC 1800 

AATGTGGGAG CAGOGGTTTG GGGAGCAGAQ ACACTAACQA CTTCAGGGCA GC5BCTCTQAT 1860 

ATTCC3VTGAA TGTATCAGGA AATATATATG TGTGTGTATG TTTGCACACT 'iOTTGTOTGa 1920 

GCTGTGAGTG TAAGTGTGAG TAAGAGCTGG TGTCTGATTG TTAAGTCTAA ATATTTCCTT 1980 

AAACTGTGTG GACTGTGATG CCACACAGAG TGGTCTTTCT GGAGAGGTTA TAGGTCACTC 2040 

CTGGGOCCTC TTGGGTCXICC CACS3TGACAG TGCCTGGGAA TGTACTTATT CTGCAGCATG 2100 

ACCTGTGACC A6CACTGTCT CAGTTTCACT TTCACATAGA TGTCCCTTTC TTGGCCAGTT 2160 

ATCCCTTCCT TTTAGCCTAG TTCATCCAAT CCTCaCTQGG TGGGQTGAGG ACX»CTCCrT 2220 

ACACTGAATA TTTATATTTC ACTATTTTTA TTTATATTTT TGTAATTTTA AATAAAAOTQ 2280 
ATCAATAAAA TGTGATTTTT CTGA



I I I I I I 

HRALLARLLL CVLWSDSKG SNELHQVPSN CDCUfGGTCV SNKYFSNIHW CWCPKKFGGQ 
HCEIDKSKTC YEGHGHFYRG KASTDTKGRP CLPWNSATVL QQTYHAHRSD ALQI/5LGKHN 
YCRHPDMRRR PWCWQVGLK PLVQECMVHD CADGKKPSSP PEELKFQCGQ KTLHPRFKII 
GOEFTTIEMQ PWFAAIYHHH HGGSVTYTCG GSLISPCWVI SATHCFIDYP KKEDYIVYLG 
RSRIiNSmXK3 EMKFEVENLI LHXDYSASTI. AHHHDIALLK IRSKEGRCAQ PSRTIQTICIi 
PSMYNDPQFQ TSCBITGPGK BHSTDVLYPE QLKMTVVKI.I SHREt3QQPHY YGSBVTTKML 
CRADPQHKTD SCXJGDSGGPI. VCSLQGBKTL TOIVSHGRGC ALKDKPGVYT RVSHFLPWIR 
SHTKEENCaiA L 

Seq ID NO: 293 DNA sequence
Nucleic Acid Accession #: NM_001498
Coding sequence: 93..2006



GGCACGAGGC TGAGTGTCCG TCTC53CGCCC GOAAGCGGQC GACCGCCGTC AGCCCGGAGG 60 

AGOAGGAGGA GGAGGAGGAG GAGGGGGCGG CCATGGGGCT GCTGTCCCAG GGCTOOCCOC 120 

TGAQCTGGGA GGAAACCAAG CGCCATGCCX3 ACCACGTGCG GCXMSCACGGG ATCCTCCAGT 180 

TCCTGCACAT CTACCACGCC GTCAAGGACC GGCACAAGGA CGTTCTCAAG TGGGGCGATG 240 

AGOTGGAATA C»TGTTGOTA TCTTTTGATC ATGAAAATAA AAAAGTCCGG TTGGTCCTGT 300 

CTOQOaAOAA AerrCTTGAA ACTCTGCAAG AGAAGGGGGA AAGGACAAAC «»AACCATC 360 

CTACCCTTTC QAGACCftGAG TATGGGAGTT ACATGATTOA AGGGACACCA GGACAGCCCT 420 

ACGGAGGAAC AATGTCCGAG TTC34ATACAG TTGAGGCCAA CATGCGAAAA CGCCGGAAGG 480 

AGGCTACTTC TATATTAGAA GAAAATCAGG CTCTTTGCAC AATAACTTCA TTTCCX3U3AT 540 

TAGGCTGTCC TGGGTTCACA CTGCCCGAGG TCAAACCCAA CCCAGTGCSAA OaAGOAaCTT 600 

CCAAGTCCCT CTTCTTTCCA GATGAAGCAA TAAACAAGCA COCTOQCTTC AGTACCTTAA 660 

CAftGAAATAT CCGACATAGG AGAGGAQAAA AGQTTGTCAT CAATQTACCA ATATTTAAGG 720 

ACAAGAATAC ACCATCTCCA TTTATAGAAA CATTTACTGA GGATGATQAA GCTTCSUWSGG 780 

CTTCTAAGCC GQATCATATT TACATGGATG CCATGGGATT TGGAATGGGC AATTGCTGTC 840 

TCCAGSTCAC ATTCCAAGCC TGCAGTATAT CTGAGGCCAG ATACCTTTAT GATCAGTTGG 900 

CTACTATCTG TCCAATTGTT ATGGCTTTGA GTGCTGCATC TCCCTTTTAC CGAGGCTATG 960 

TGTCAQACAT TGATTGTCX3C TGGGGAGTGA TTTCTGCATC TGTAGATGAT AGAACTCGGG 1020 

AGGAGCGAGG ACTGGAGCCA TTGAAGAACA ATAACTATAG GATCAGTAAA TCXXGATATG 1080 

ACTCAATAGA CAGCTATTTA TCTAAGTGTO GTOAOAAATA TAATOACATC QACTTOACQA 1140 

TAGATAAAGA GATCTACX3AA CAGCTOTTGC AQGAAGQCAT TGATCKTCTC CTOGOCCAaC 1200 

ATGTTGCTCA TCTCTTTATT AGAGACCCAC TGACACTOTT TQAAGAOAAA ATACACCTOO 1260 

ATGATGCTAA TGAGTCTGAC CATTTTGAGA ATATTCAGTC CACAAATTGG CAGACAATGA 1320 

GATTTAAGCC CCCTCCTCCA AACTCRGACA TTGGATGGAG AGTAGAATTT CXSACCCaTGG 1380 

AGGTGCSVATT AACAGACTTT GAGAACTCTG CCTATGTGGT GTTTGTGGTA CXGCTCACCA 1440 

GAGTGATCCT TTCCTACS^ TTGGATnTC TCATTCCACT GTCAAAGGTT GATQAQAACA 1500 

TGAAGGTAGC ACAGAAAAGA GATGCTGTCT TGCAGGQAAT GTTTTATTTC AGGAAAGATA 1560 

TTTGCAAAGG TGGCAATGCA GTGGTGGATG GTTGTGGCAA GGCCCAQAAC AGCACGGAGC 1620 

TCGCTGC3U3A GGAGTACACC CTCATGAGCR TAGACACCAT CATCAATGGG AAGOAAGOTQ 1680 

TGTTTCCTGQ ACTOATCXrCa ATTCTGAACT CTTACCTTGA AAACATGQAA GTQGATGTGG 1740 

ACACCAGATG TAGTATTCK} AACTAOCTAA AOCTAATTAA OAAGAGAGCA TCTQGAGAAC 1800 

TAATGACAGT TGCCAGATGG ATGAGGGAGT TTATCGC301A CCATCCTGAC TACAAGCAAQ 1860 

ACAGTGTCAT AACrOATOAA ATGAATTATA GCCTTATTTT GAAGTGTAAC CAAATTQCAA 1920 

ATGAATTATG TGAATGCCCA GAGTTACTTG GATCAGCRTT TAGGAAAGTA AAATATAGTG 1980 

GAAGTAAAAC TGACTC»TCC AACTAGACAT TCTACAGAAA GAAAAATGCA TTATTGACX3A 2040 

ACTGGCTACA OTACCATGCC TCTCAGCCCH TGTQTATAAT ATGAAGACCA AATGATAGAA 2100 

CTGTACTGTT TTCTGGGCCA GTGAGCCAGA AATTGATTAA GGCTTTCTTT GGTAGG TAAA 2160 

TCTAGAGTTT ATACAGTGTA CATGTACATA GTAAAOTATT TTTGATTAAC AATGTATTTT 2220 

AATAACATAT CTAAAGTCRT CATQAACH3G CTIGTACATT TTTAAATTCT TACTCTGGAG 2280 

CAACCTACIQ TCTAAGCAST TTTGTAAATG TACTGGTAAT TGTACAATAC TTGCaTTCCA 2340 

GAOrrAAAAT GTTTACTOTA AATTTTTGTT CTTTTAAA0A CTACCTOGGA CCTGATTTAT 2400 

TGAAATTTrr CTCTTTAAAA ACATTTTCPC TCXJTTAATTT TCCTTTGTCA TTTCCTTTQT 2460 

TGTCTACATT AAATCACTTQ AATCC&TTOA AAGTGCTTCA AQGSTAATCT TQaOTTTCTA 2520

GCACCTTATC TATGATGTTT CrTTTGCAAT TGGAATAATC ACTTGGTCAC CTTGCCOOUV 2580 
GCTTTCCCCT CTOAATAAAT ACCCAnOAA CTCTGAAAAA AAAAAAAAAA AARA 



1 11 21 31 41 51 

I I I I I I

MGLLSQGSPL SWEETKRHAD HVRRHGILQF IBIYHAVKDR HKDVLKWGDE VEYMLVSFDH 
EHKKVRLViS GBXSrLStLfXB. KQERTHPMRP TI.WRPBXGSY MIBSTPGQPY GGTMSEFNTV 
BANMRKRRKE ATSIliSNQA LCTITSFPRL GCPGFTLPEV KPNPVEGGAS KSIiFFPDEAI 
1IKHPRPSTI.T HMIRHBSGBK WINVPIFKD KHTPSPFIET FTEDDEASHA SKPDHIYMDA 
MGEGMGNCO' QVTFQACSIS EARYIiYDQIA TICPrVMALS AASPFYRGW SDIDCKWGVI 
aASVDDRTHE BRGLEPUCNM NYRISKSRVD SIDSYLSKCX3 EKYNDIDLTI DKEIVEQLLQ 
EGIDHIiAQH VAHLFIRDPL TI.PBBKIHLD DANESDHFEN IQSTOMQTMR PKPPPPNSDI 
GWRVEPRPME VQLTDFENSA YWFWLLTR VILSYKLDFL IPLSKVDEWM KWAQKRDAVTi 
QGMFYPRKDI CKGGHAWDG CX3KAQNSTEL AAEEYTI*ISI DTIINOKEGV FPGUPILNS 
YLENMEVDVD TRCSII^nfLK LIKKRASGEI. MTVARWMREP lANHPDYKQD SVITDEMHYS 
LIIiKCKQIAN ELCECPBIiG SAFRKVKafSG SKTDSSN 



296 



Seq ID NO: 295 DNA sequence
Nucleic Acid Accession #: r 
Coding sequence: 247-816 

1 11 21 31 41 51 

I I I I I I 

AGTGTTCX3GC TGGGC3CA£3GC ACGCTGTGGC TGGCTACTTC CCTTCCTCXX ATCCCCCTTG 60 
GQCCAAACGG GATCGGTGCT TCTGGTGAGA OXXrrCCCCA TGCACATCaC TCCCAGGTGC 120 
CCTAGGGGGC ACATTTCX;CA C3VACTCCX»G AGGGC3W3QTT TCTAGAAAGT GCC3VCCAGTG 180 
GGGAGGOGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG COGTCTCTCC TCCA GCAAGG 240 
OAAACAATGA CCGATAAAAC AQAOAAOGrG GCIGTAGATC CTGAAMTTGr GTTTAAAGCT 300 
CCCAQGQAAT GrGACAQTCC TTOSTATCAG AAAAOOCAiBA GQATGGCCCT QTTGGCAAGG 360 
AAACAAGGAG CAGOAGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGA AAAGAAGCTT 420 
ATGACAQQAC ATGCTATTCC ACCCAGCCAA TTGQATTCTC AGATTGATGA CTTCACTGGT 480 
TTCAGCAAAG ATAQOATGAT GCAGAAACCT GGTAGCAATG CACCTGTGGO AGGAAACGTT 540 
ACCAGCAGrr TCTCTGGAGA TGACCTAGAA TGCAGAGAAA CRGCCTCCTC TCCCAAAAGC 600 
CAACGAGAAA TTAATGCTGA TATAAAACGT AAATTAGTGA ACGAACTCCX3 ATGCGTTGGA 660 
CAAAAATATG AAAAAATCTT CX3AAATGCTT OAAGGAGTGC AAGGACCTRC TGCAGTCAOG 720 
AAGCGATTTT TTGAATCCAT CATCAAGGAA GCAGCAAGAT GTATGAGACQ AGACTTTGTT 780 
AAGCACXHTA AOkAGAAACT GAAACQTATG ATTTGAGAAT ACTTGTCCCT GGAfiOATTAT 840 
CACACXXrCAA ATGCATAATC TOQTTAATGA TTGAGGAGAG AAAAOGATCA GATTGCTOTT 900 
TTCTACSVATS GASCRGQATA TTGCTGAAGT CTCCTGGCAT ATQTTAOCGA ATCAAATAGC 960 
CTTCCAGAGG CTAAGAAATT TCTGTTAGTA AAAGATOTTC TTTTTCCCAA AGCaVTTTTAT 1020 
? TAGATATTAT 1080 



1 11 21 31 41 51 
HIOKTEKVAV DPSTVFXRSR ECDSPSYQXR QRMAIiLARKQ OAGDSIiIAGS AMSKEKKLMT
OKAIPPSQLD SQIDDFTGFS KDRMMQKPOS HAPVGGHVTS SFSGDDLECR ETASSPXSQR 
BINADIKRKI. VKELRC\R3QK YEKIFQ4LEa VQaPTAVRKR FFBSIIKEAA ROOtSDFVKH 
UCKKIjKSMI 

Seq ID NO: 297 DNA sequence
Nucleic Acid Accession #: Ea
Coding sequence: 247-815



I I I I I I

AGTGTTCGGC TGGGGCAGGC ACGCTGTGGC TGGCTACTTC CCTTCCTCCC ATCCCCCTTG 60 

GGCCAAACGG GATCGGTGCT TCTGGTGAGA CGCCTCCCCA TGCACATCAC TCCCAGGTGC 120

CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTX TCTAGAAAGT GCCACCAGTG 180 

GGGAGGCGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG COGTCTCTCC TCCAGCAAGG 240 

GAAACAATGA CCGATAAAAC AQAGAAGGTG GCTGTAGATC CTGAAACTGT GTTTAAACGT 300 

CCCAGGGAAT GTGACAGTCC TTCGTATCAG AAAAGGCAGA GQATGGCCCT OTTGGCAAGG 360

AAACAAGGAG CAGGAGACAG CCTTATTGCA GGCTCTGCCA TOTCCAAAOA AAAGAAGCTT 420

ATGACAGOAC ATGCTATTCC ACCCAGCCAA TTQGATTCTC AQATTGATGA CTTCACTGGT 480 

TTCAGCAAAG ATAGGATGAT GCAGAAACCT GGXAGCAATG CACCTOTGOG AGGAAAOGTT 540 

ACCAGCAGTT TCTCTGGAOA TQACCTAQAA TGCAGAGAAA CAGCCTCCTC TCCCAAAAGC 600 

CAACAAGAAA TTAATGCTGA TATAAAACGT AAATtAOTSA AGGAACTCCG ATGCGTTGGA 660 

CAAAAATATG AAAAAATCTT OGAAATQCTT GAAGQAGTGC AAGGACCTAC TGCAGTCAGG 720

AAACGATTTT TTGAATCCAT CATCAAGGAA GCAGCAAGAT GTATGAGACQ AGACTTTGTT 780 

AAGCACCTTA AGAAOAAACT GAAAOGTATG ATTTGAGAAT ACTTGTCCCT GGAGG ATTAT 840 

CACACCCCAA ATGCATAATC TCATTAATQA TTGAGGAGAG AAAAOGATCA GATTGCTGTT 900 

TTCTACAATG GAGCAGGATA TTGCTGAAGT CTCCTGGCAT ATGTTACCGA ATCAACTGGC 960 

CTTCCAGAGG CTAAGAAATT TCTGTTAGTA AAAGATOTTC TTTTTCCCAA AGC6TTTTAT 1020

TTGAAAGGAT AACTTGTGTT TTaGTTATTT TOTATTOXft CCTGTGCIGa TAGATATTAT 1080 
TAACCCATTA GGTAAATACT ATTACAGTOQ W3QTTTCTGC A 



: Eos sequence 



Seq ID NO: 299 DNA sequence
Nucleic Acid Accession #: Be
Coding sequence: 247-815



1 11 21 31 41 51 

ioTGTTOOOC TGGGGCAGGC AGGCTGTGGC TGGCTACTTC CCTTCCTCCC ATCCCCCTTG
GGCCAAACGG GATCGGTGCT TCTGGTGAGA CGCCTCCCCA TGCACATCAC TCCCAGGTGC 
CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAAGT GCCACCAGTG 
GGGAGGCGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG COGTCTCTCC TCCAG CAAGG 
GAAACAATGA CCGATAAAAC AGAGAAGGTG GCTGTAGATC CTGAAACTGT QTTTAAACGT
CCCAGGGAAT GTGACAGTCC TTCGTATCAG AAAAGGCAGA GGATGGCCCT GTTGGCAAGG
AAACAAGGAG CAGGAGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGC AAAGAGCTTA 
TGACAGGACA TGCTATTCCA CCCAGCCAAT TGGATTCTCA GATTGATGAC TTCACTGGTT 
TCAGCAAAGA TMGATGATQ CAGAAACCTG GTAGCAATGC RCCTGTGGGA GGAAACGTTA 540

CCAGCAGTTT CTCTGGAGAT GRCCTAGAAT GCRGAGAAAC flGCCTCCTCT CCCAAAAGCC 600 

AACAAOAAAT TAATGCTGAT ATAAAACGTA AATTAGTGAA GGAACTCCGA TGCGTTGGAC 660 

AAAAATATGR AAAAATCTTC GAAATGCTTQ AAGGAGTGCA AC3GACCTACT GCAGTCAGGA 720 

AAOGaVTTTTT TGAATCCATC ATCAAGGAAG CAGCAAGATQ TATGAGACQA GACTTTGTTA 780 

AGCACCTTAA GAAGAAACTG AAACGTATGA TTTGAGAATA CTTGTCCCTQ GflGGATTATC 840 

ACACCCCAAA TGCATAATCT CATTAATGAT TGAGGAGAGA AAAGGATCAG ATTGCTOrTT 900 

TCTACAATGG AGCAGGATAT TGCTGAAGTC TCCTGGCATA TGTTRCOGAA TCAACTGGCC 960 

TTCCAGAGGC TAAOAAATTT CTGTTAOTAA AAQATGTTCT TTTTCCCAAA GOGTTTTATT 1020 

TGAAAGGATA ACTTGTGTTr TGGTTATTTT GTATTCCCAC CTGTGCTGQT AQATATTATT 1080 
AACCCATTAG GTAAATACTA TTACaOTOGT GGTTTCTGCA 



I I I I t 

/ DPETVFKRPR ECDSPSYQKR QRMALIiARKQ OAGDSLIAGS AMSKAKKLMT 

OHAIPPSQU} SQIODFTGFS XDHIWQXPGS NAPVGGMVTS SFSGDOIiECR BTASSPKSQQ 
EINADIKRKIi VKBLRCVGQX YEKIFEMIiSQ VOGFTAVRXR FPBSIIKBAA KCMRSOFVKB 



Seq ID NO: 301 DNA sequence
Nucleic Acid Accession #: Eo
Coding sequence: 247-812 

1 11 21 31 41 51 

I I I I I I

AGTGTTCGGC TGGGGCAGGC ACGCTGTGGC TGGCTACTTC CCTTCCTCCC ATCCCCCTTO 
GGCCAAACGQ GATCGGTGCT TCTGGTGAGA CGCCTCCCCA TGCACATCAC TCCCAGGTGC 
CCTAGGGGGC ACATTTCCCA CRACTCCCAG AGGGCAOGTT TCTAQAAAGT GCCACCAGTG 
GGGAGGCGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG COGTCTCTCC TCCAGCAAGG 
GAAACAATGA CCGATAAAAC AOAGAAGGTO GCTGTAQATC CTQAAftCrGT OTTTAAAOGT 
CCCAGGGAAT GTGACAOTCC TTOOTATCAG AAAAGGCAGA GOATGOCCCT OTTCOCAAGO 
AAACAAGGAG CAG6AGACAO CCTTATTGC3V GGCTCTGCCA TOTCCAAAGA AAAGAGCTTA 
TGACAGGACA TGCTATTCCA CCCAGCCAAT TGQATTCTCA GATTGATGAC TTCA CTGGT T 
TCAGCAAAGA TGGGATGATG CAGRAACCTG GTAGCAATGC ACCTGTGQGA GQAAATGTTA 
CCAGCAATTT CTCTGGAGAT GACCTAGAAT GCAGAGGAAT AGCCTCCTCT CCCAAAAGCC 
AACAAGAAAT TAATGCTGAT ATAAAATGTC AAGTAGTGAA GOAAATCCGA TGCCTTGGAC 
AATATGAAAA AATCTTCGAA ATGCTTGAAG GAGTGCAAC3G ACCTACTGCA GTCAGGAAAC 
GATTTTTTGA ATCCATCATC AAGGAAGCAQ CAAGATGTAT GAGACGAGAC TTTGTTAAGC 
ACCTTAAGAA GAAACTGAAA CXSTATQATTT SAGAATACTT GIOCCTGGAG GAT TATCAC A 
COCCAAATGC ATAATCTCRT TAATGATTGA GGAGAGAAAA 6GATCAGATT GCTOrTTTCT 
ACAATOOAOC AGQATATTGC TGAAGTCTOC TOGCATATGT TAOCGAATCA fcCIGG CCrXC 
CAGAGGCTAA GAAATTTCTG TTAGTAAAAG ATGTTCTTTT TCOCAAAGOQ TTTTATTTOA 
AAGQATAACT TGTGTTITGG TTATTTTGTA TTCCChCCIQ TGCIGGTAaR TATTATCAKC 
CCATTAGQTA AATACTATTA CAGTOjTGGT TTCTGCa 



1 11 21 31 41 51

MTDKTEKVAV DPETVFKRPR ECDSPSVQKR QRMALUUKKQ OAGDSLIAGS AMSKEKXLMT 60 

CaiAIPPSQLD SQIDDPTGPS KD(aO<iQKPGS KAPVGGHVTS HFSQDSLECR 6IASSPKSQQ 120 

BINADIKCQV VKEIRCUSQY EKIPEMLEGV QGPTAVRKRF FBSIIXEAAR CMRRDPVKHt 180 

KKKI.KRMI 

Seq ID NO: 303 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 247-815

1 11 21 31 41 51 

AGTGTTCGGC TGGGACAGGC ACGCTGTGGC TGGCTACTTC CCTTCCTTCC ATCCCOCTTQ 60 

GGCCAAACAG GATCGGTGCT TCTGGTGAGA CQTCTCCCCA TGCACATCAC TCCCAGATGC 120

CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAAQT GCCACCAQIG 180 

GGGAGGCGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG COGTCTCTCC TCCAG CAAGS 240 

GAAACAATGA CCGATAAAAC AGAGAAGaTG GCTQTAGATC CT6AAACTGT GTTTAAAGGT 300 

CCCAGGGAAT GTGACAOTCC TTCGTATCAG AAAAGGCAGA GGATGGCCCT QTTGGCAAGG 360 

AAACAASQAG CAGGAGACAQ CCTTATTGCA GGCTCTGCCA T6TCCAAAGC AAAGAGCTTA 420 

TGACAGGACA TGCTATTCCA CCCAGCCAAT TGGATTCTCA GATTGATGAC TTCACTGGTT 480 

TCAGCAAAGA TAGGATGATQ CAGAAACCTG GTAGCAATGC ACCTGTGGGA GGAAACGTTA 540 

CCAGCAGTTT CTCTGGAOAT GACCTAGAAT GCAGAGAAAC AGCCTCCTCT CC CAAAA GCC 600 

AACAAGAAAT TAATGCTGAT ATAAAACGTA AATTAGTGAA GGAACTCCGA TGCGTTQQAC 660 

AAAAATATGA AAAAATCTTC GAAATGCTTG AAGGAGTGCA AGGACCTACT GCAGTCAGGA 720 

AACGATTTTT TGAATCCATC ATCAAGGAAG CAGCAAGATG TATGAGAOGA GACTTTGTTA 780 

AGCACCTTAA GAAGAAACTG AAACGTATGA TTTGAGAATA CTTGTCCCTO GAGGATTATC 840 

ACACCCCAAA TGCATAATCT CGTTAATGAT TGAGGAGAGA AAAGGATCAG ATTGCTQTTT 900 

TCTACAATGG AGCAGGATAT TGCTGAAGTC TCCTGGCATA TGTTACOGAA TCA ACTGG CC 960 

TTCCAGAGGC TAAOAAATTT CTGTTAGTAA AAGATOTTCT TTTTCCCAAA GCGTTTTATT 1020 

TGAAAGGATA ACTTCTGTTT TGGTTATTTT GTATTCCCAC CTGTGCTGGT AGATATTATT 1080 
AACCCATTAG GTAAATACTA TTACAGTCGT GGTTTCTGCA 
11



21 



31 



41 






I I I I I I 

MTDKTEKVAV DPETVFKRPR ECDSPSYQKR QRMRLLAH1«3 GAGDSLIAGS AMSKAKKLMT 
GKAIFFSQIiD SQIDOFTGFS KDRMMQKPGS HAPVGGNVTS SPSgnDTi K TR ETASSPEC8QQ 
EIHADIKRKL VKELRCVGQK YEKIPE»ILEa VQGPTAVRKR FFESIIKEAA RCMRRDFVKH 
LKKKLKRMI 

Seq ID NO: 305 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 87-689



GAACAATAC». 
TTATGCCTTC 
TGTCAATTTT 
AGAlGAAATCT 
OAAGOATTAT 
CTCCCAAAAG GCCACCGTCT 



CCAGACTAGC 
AGATGTCCGC 
CAGAGGTCCC 
TGTCCGQGAA 



ATAATTTAAA 
A6TAT6AGAA 
CTGCTAAAOT 
AGGAGGAGGR 
TTAJSAGTAGG 
ATTAOaTTTA 
AATTGTGAGT 
AAOTTOTACA 
CTGTGCACTT 
ATTTGTAAGG 
TATCTATAGT 



TGACAGTGAA 
GGATOTTGCT 
TGCCCGGAAA 



21 
I 

GGCTGGGGAG 
GTCRGGATGG 
TTTGTGCAC3A 
GCGGAATTTT 
AAATTTGATG 
GGACCAGCTA 
GGATTCTTCC 
TCTATTGGAG 



31 

1 

CGCTGAGCCG 
CTAAAG6TGA 
CATQCAGACJA 
CCAAQAAOTG 
AAATGGCAAA 



OTATATAGTO 



ATTACAAAAT 
CGTTTACATG 
TATTTCCAAA 



TGGTGGTAAC 
TTGTAAAAAG 
TGTGGGGAAO 
CTOTTGACTC 
ACATAGCATT 



OACTATAAGT 
AAGGTGGAAQ 
GATGAATAAA 
AATTGACACA 
TTCATCACGA 



AOSTCX3CAAA 
ACATCACTAA 
CGAAAGGAAA 
AGGAAGATGA 
GAAACTGTTr 
TCTCTTATTT 
TCATATTGTA 



41 51 
I I 
CGCGTCGTGC CCTGCGCTGC 
CCCCAAGAAA CCAAAGGGCA 
AGAACATAAO AAGAAAAACC 
CTCTGAGAGS TGGAAGACGA 
GGCAGATAAA GTGCQCTATQ 
GAAGAASAAO GATCCTAATG 
AdAATTCCQC CXXaAOATCA 



GQCGGCAAAQ 
GTTTGATGGT 
AGAAGAGGAG 
ATCTGTCTCC 
GAGAAGTGTC 
GTCTCTCAAA 



CTGAAGGAGA 
GCAAAGGSTC 
GAGGAAGAAO 
TTGTGAATAC 
TGTTGCCCTC 
GTGCTCIA6A 




TATMAACTC 
CTCCTGTACT 
GAAATGTTTX 
A6TCAATTTC 
CTCCCTATAA 
TGAAGGAGAG 
ATGAAGTCTG 
AGGAAGGTGG 
CCTATTTTGT 
AAATTAAGGC 
ACATTATTTG 



ATCCATTTAG 
CTTAGCTGTG 
AQTTTTTAAA 
GAGTCACTGC 



TOAAeTTAAA TAAACACTAT TACATTTTTA 



TTn 



CTGCTGCXaVT 
TAAGTGCGQT 
GCAAAGCAAA 



ATGTasrAGC 
GQCTACTTGA 
GAGGAGTTAG 
GTGATTAGQA 
GGGGCCAAAT 



CAGTGAACAA 
TTCTTTTATT 
AGCTACTGTG 
GAGAACGACA 
CTGAGGCTAT 
GCAT TGCT AA 
TTCTCTTTCA 
CATTTGGGOT 



TTGGAAACAC 
AGGGCAGGCI 
GCCTGCTCAT 
GAAGGAGCTT 
GTTGGGGTGA 
CTGATGTGTA 
OTGAGTGTTG 
CGCAGGAGTG 
GTTGAGAAAC 
TAOGAGTTAT 
CATCAOAACT 



CAAACACCCC 
AATGGAATCA 
AAGTTTTVeCT 
GGTTTGTGTQ 
GGGGAGATGQ 



CTATTGCCCA 



GCCATGTiAJ'i' 
CTAGGCCAAG 
GGTCAAAAGG 



TTQCATQTCT 
GGTCACGGTC 
GTTTGTCCTG 
TGTGAGQTTT 
TGTCACTTGG 
ATTOGGOAGC 



AAGGAAGATG 
ACCATTTCTG 
CATTCACTGG 
TCAaTGGTTA 
CCACAGTAGC 
TACTGTCCGT 
GCATTAATAT 
ATTAATTTTA 
OGAGGCGGTG 
ACACCCTGAT 
AATGTGTTCC 



ACCCCCACTC 
ACTCAGTGGC 
TGATTTTGTT 
TAGGCAAGGT 
CTAGGTTTAA 
ACAGCAATTT 
CaXTTACCCC 
CTTGAOCCTG 
TAGGGSACGG 



TGTCAAGTTG 
CACACTGTGG 
AAGTGGCATG 
TTCTTAAGAT 
CTACTCCCTC 



AGTTTAGCCA 

AGGCCAcrra 

CAAGATTGCT 
ATGTTACCTA 
GCCAACCTGT 
TAACCACCTC 



CATTCTAAOC 
TCTTGCCAGC 
TTAAOTGGTQ 
AAAGGGGTAA 



CTTCTAGTGd 



AGCACTAAAT 
AAATGTAGAT 
TATTAGTGC^ 
AAGTGGTGAC 
AGCAATGAAG 
TTGQQTQTGT 
AQAGAAAGCA 
TCCTCTCCGC 
CTCTTATGTG 
TCTAGTTCTA 



GQ06TCTATC 
CTTTTGTGCT 
GATTAGGCAA 
TGOGGCAGCA 
AATA ATGC CC 
TTAGCTTGAT 



ATCAAT6AAA 
TrnGXATOT 
AAACTCTTCT 
CATTGTATTT 
CAGCTCACTT 
TGTGTCTGAG 
TCAGCAOCCT 
CTTTTGTCCC 
CAGAGTGTAT 
CCGTGCTCCT 
CTGCTGGTCT 
TATCCTTTTT 
TCTTGGQCCA 
GTATCA!TGAA 
TOATGTTCAA 
TASTGTAACA 
ACTAAAtACC 



GGTCAGCTGG 
TTAAACAAAC 
GTTCAAGAAC 
TTAGAATGCT 
CTAXTATAAC 
G6AGACTGGC 
AGGOCTGAGA 
TCGCaiTTCAa 
TCC3«GGTAT 
ACCTCCACCC 
GGTGTGTCAA 
GGCACATATC 
CCTGGATGCC 
TTGCTCCTAC 
CCTGAGCTAT 
AAI3TTQAATG 
TCTTA AACTO 
TTTTATCCAG 



ACCCCATTCT T 



r CTGTAATAAA G 



ATGTTTOAGG 
GCTTTTTCTT 
CKTGTCGGGT 
TTCATAGCCA 
GAAAATGACC 
GAATTICCCC 
G d TC C CAAT 
AT6GAAGAQA 
TCATAACTAO 
TCCAAAGTAG 
ACATTGAGTT 
AGATCACTAC 
TAGTTTCTCr 
TTCTGG6CCC 
(XCAT TTAAA 
TTTATCCTTC 
TGTCTGTTCT 



CTATGAAACA 
AAAATTCACT 
CCTGGATGAG 
TTOJCTCTCC 
ACIAATTTAA 
TCCIGCTTCA 



ATGTACCAAC 
CTAAOCAOAA 
TTAAAGAGGC 
TCAAOGTTTT 
GAQATGATOT 
ACTGTCTGTC 
AGGATAGTAC 
GGAAAGAACA 
TGTCACAAAT 



1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 



2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 






Seq ID NO: 307 DNA sequence
Nucleic Acid Accession #: NM_022342
Coding sequence: 1..2178
1 11 21 31 41 51

I I I I I I 

ATGGGTACTA GGAAAAAAGT TCATGCATTT GTCCGTGTCR AACCCACCGA TGACTTTGCT 60 

CATGAAATGA TCAGATAOGG AGATGACAAA AGAAGCaTTG ATATTCACTT AAAAAAAGAC 120 

ATTCGGAGAG GACSTTGTCAA TAACCAACAQ ACAQACTGGT OGTTTAAGTT GGATCGAGTT 180 

TTCACGATG CCTCCXavGGA CTTGGTTTAT GAGAC3VGTTG CAAAGGATGT GGTrTCTCRG 240 

CCCTCGATG GCTATAATGG CACCATCATG TGTTATGGGC AGACGGGAGC TGGCAAGACA 300 

ACACCATGA TGGGGGCAAC TGAGAATTAC AAGCACCGGG GGATCCTCOC TOOTGCCCTG 360 

AGCaMSGTTT TTAGGATGAT OGAAGRAOGC CCCaCACATG CCATCACTGT GCGTGTTTCC 420 

ACTTGGAAA TCTATAATGA GAGCKTtrrTT GATCTCCTGT CCACTCXGCC CTATGTTGGA 480 

CCTCAOTCA CACXAATGAC CATCGTGOAA AACCXTTCAAG GAGTCTTCAT TAAOGQCTTQ 540

CAGTTCACC TCACAAQTCA GGAOQAGGAT GCATTCAGCC TCCTTrTTOA GGGTO AOAC C 600 

ACAGGATTA TAGCCTCCCA C3VCTATGAAC AAAAACTCTT CCAGATCACA CTGCATTTTC 660 

CCATCTACT TAGAGGCCCA TTCCCGGACC TTATCAGAGG AAAA6TACAT CACTTCCAAA 720 

TTAACrrGG TGQATCTGGC AGGCTCAGAG AGGCTGGGGA AGTCTGGGTC T6AGGGCCAA 780 

TCCTGAAGG AAGCCACCTA CATCAACAAA TCGCTCTCAT TCCTGGAGCA GGCCATCATT 840 

CXXrrTGGGG ACXaCAAGCXJ GGACCACATC CCCTTTCGGC AGTGCAAGCT CACCCAC3GCT 900 

TOAAOGACT OOTTAGGaaa AAACTGCAAT ATGGTCCTOG TGACAAACAT CTATGGAGAA 960 

CTGCCCAGI TAGAAGAAAC GCTATCTTOA CTQAGATTTa CCAGCAQOAT GAAOCTAGTC 1020 

CCACTGAOC CTGCCATCAA TGAAAAQTAT GKIGCTGAQA GAATOGTCAA GAACXTTGCAG 1080 

AGGAACTAQ CACTACTCAA QCAGQAGCTO GCTATCCATG ACAGCCTGAC CAACCGCACC 1140 

TTGTGACXrr ATGACCXXaT GGATGAAATC CAGATTGCTG AGATCAACTC CCAGGTOCXSG 1200 

GGTACCTGG AGGGGACACT GGA03AGATC GACATAATCA GCCTTAGACA GATCAAGGAG 1260 

TGTTCAACC AGTTCCGGGT GGTTCTGAGC CMCAGGAAC AGGAAGTGGA GTCCACXTTG 1320 

GCAGGAAGT ACACCOTCAT TGACAGGAAT GACTTTGCAG CCATTTCTGC TATCCAGAAO 1380 

CGGGGCTTG TGGATGTTGA TGGCCACCTA CTGGGTGAGC CTGAAGGACA AAACTTTGGA 1440 

TCGGAGTCG OXCTTTCTC TACC3\AACCT GGGAAGAAAO CCAAGTCCAA GAAGACATTC 1500

AA6A6CCAC TCAGGCXXSh CACCCCACCC TCXaAACCAG TGGCCTTTGA GGAGTTTAAQ 1560

AT6AGCAAG GTAGTCBVflAT CAACOQAATT TTCAAAGAAA ACAAATCCAT CTTGAATGAA 1620 

G6AGOAAAA GGQCCAGCQA GACCACACAG CACATCAATG CCATCAAGCG GGAGATTGAT 1680 

TGACCAAGG AGGCCCTOAA TTTCCAGAAG TC»CTACGGG AGAAGCAAGG CAAGTAOGAA 1740 

ACAAGGGGC TGATGATCAT COATGAGGAA GAATTCCTGC TGATCCTCAA GCTCAAAOAC 1800 

TCAAGAAGC AGTACCGCAG CX3AGTACCAG GACCTGCGTG ACCTCAGGOC TQAOATOaa 1860 

ATTGCCAGC ACCTAGTGGA TCAGTGTOGC CACOGCCTGC TCATGGAATT TGACATCTGG 1920 

ACAATGAGT CCTTTGTCAT CCCTGAGGAC ATGCAOATGG CACTGAAGCC AGOCaQCAGC 1980 

TCCGGCCAG GCATGQTCCC TGTGAACAGG ATTGTSTCTC TQGGAGAACa TGACCSMJOAC 2040 

AATTCAGO: AQCTGCAGCA GAGGGTGCTT CCKUSGGCC CTGATTCCAT CTCCTTCTAC 2100 

ATGCCAAAG TCAAGATAGA GCAGAAQCAT AATTACTTGA AAACCATGAT GGGCCTCC3U3 2160 
AGGCACATA GAAAATA6 



1 11 21 31 41 51 

I I I I I I 

MGTWaCVHAP VRVKPTDDPA HEMIRYGDDK RSIDIHIjKKD IHRGWKIIQQ TDWSFKLDGV 
LHDASQm.Vy ETVAKDWSQ ALDGVUGTIM CYGQTGAGKP YTMMGATENIf KHRGILPHAL 
QQVFRMIEBR PTHAITVRVS YLEimlESLP DUiSTIiPyVG PSVTPMTIVE NPQGVFIKGI. 
SVHLTSQEED AFSLLFBGET KRIIASKTMM KN8SKSHCIF TIYLEAHSRT LSEEKYITSK 
nUiVDIiAGSB RUSKSGSBOQ VLXEATYINR SIiSFIiBQAZI AUmQKBDHI PFRQCKLTHA 
t>KDSI.GGITC3T MVLVXtnXGB AAQIiBETLSS LEtFASRHKLV TTBPAIHEKY DAERKOTOniE 
XBTiATiTiKQWr- AlHDSIiTliRT FVTXDFMDSX QIAEIHSQVR RyLBOTUlBI DI IMiBO IKB 
VPNQPRWLS QQEQEVESTIi RRKmilDRN DFAAISAIQK KOSMBVOBBX, vaBPEGQNFG 
I.GVAPPSTKP GKKAKSKKXF EBPIiRFDTPF SKFVAFEBFK MBQOSBIIIRI FKENXSILNE 
RRKRASETTQ HINAIKREID VTKEALNFQK SUiEKQGKZE HK6LMIIDEE BFUiUJOiRD 
LKKQYRSEYQ DLROLRABIQ YOQHLVDQCR HSLLMBFDIH YNESFVIPEO ffQMALKFGGS 
ISPGMVPVNR IVSLGEDDQD KFSQIK3QRVL PBQPDSISFY HAXVXIEQKH KYIiKTI«l6LQ 
QAHRK 



1 11 21 31 41 51 

I I I I I I

TTTTTTTTTT TTTTTTTTAA TGCCTGCTQT CATGCTCTGT CTACCAGGGT GAATTTCCAA 
AAATTTCTGC ATAQCAATTT TAGCCAAAAC TATATATGTT CTGGGGAGGA TAGGCA TAGG 
CACATTGAAG ACCAAAGGAA AGAGTGAAGA AGXGTAGTTG GGTCATTGT6 AATGGATGTT 
TAGATTGTCA A6AAAAGTG6 GCCAGAGGCC CC3VCCTCACA CTAGGACGGC AATTGCCTCT 
CAXTAOIATC TCAQQCACCA TGOSTCTTAT TTGGTGTCAT AAGAAACACC CTCAACAAAQ 
TAATOAAOeX; TCAQCCTOCA GCTTCTCTTC TTCGGGATTC TTCTTAGGGC CTCCTTTTTC 
CTTTTATGTT TOCAGTACCC TGAATTTCTT ATTCCCATCC CCdATTAAAA TCTGCTTCAA 
AGAAAAAACA AGAAGGACAC ATTCACTTTA AGATCCAAAT GAAIGATAAG AGCTTAAAAC 
ATTATACTTA TCAGTATTAT TTGCATTTTT ATAGAAACCA AAACCATATT TCRACAAC 

Seq ID NO: 310 DNA sequence

Nucleic Acid Accession #: NM_018622.2

Coding sequence: 1-1140

1 11 21 31 41 51

I I I I I I 

ATGG06TGGC 6AGGCTGGGC GCAGAGAGGC TGGGGCTGCG GCCAGGOGTG 6GGTG0STCG 
GTGGGCGGCX: GCAGCTGCGA GGAGCTCACT GCX5GTCCTAA CXXXX3CC36CA GCTOCTCGGA 
CGCAGGTTTA ACTTCTTTAT TCRACAAAAA TGCGGATTCA OAAAAGCACC CA GGAAG GCT 
GAACCTC6AA GATCAGACCC AGGGACAAGT GGTGAAGCAT P 
PCTCCCTATC C 
TATGAATCAC TGAAATCCAG
GATAGCATAA GACCACAAAA 
AACCTAAGTG ATGGCXMCX; 
TGTTTATGGA GAGTACCTTC 
GCCTCAAAGG TCCTTTGTTC 
CACATGGCaG CRAATATGTA 
GGTCAAGAGC ACTTCATGGC 
TACCTGGGTA AAGTTGCCAC 
ATGACAGTCC TCGCAGCTQT 
CTTCCGATGT TCRCGTTCRC 
GCAGGAATGA TCCTGGGATG 
TTTGGAATAT GGTATGTTAC 
GTQAAAATCT 6GCATQAAAT 






GGTCCAGAGT 
AGAAGGAGAC 
GACTGTGACA 
TCTGCAGCGG 
TCCAATGTTG 



TATTTTGATQ 
TTCAGAAAGG 
GGTATTATAG 
ACAATGATCA 
CTGTCAACAT 
AGCTTCTCTT 
TCTGCAGGTG 
GGACCATCAC 
ATCCCAGAAiQ 
GCCCTGAAAQ 
GATCATGCGG 

TTACGGTCAT GAACTGATTT GGAAGAACAQ OGAGOCQCIA lOSO 



AGTGTACCTA 
AGGAAGATAT 
CTGCACTAAQ 



GTATAAAAGC 
AGATTAACAA 
CTGCAAATGT 
GATATTTCAC 
TCAGTCSVCTT 
CCAGCATAGT 
TTATTTCCAA 
TTQGTGCATC 



T6ATT6GTT6 
GTGGTGGAAT 
CCTTGTATTC 
ATOGAATCCA 
CTCCTTATTT 
GAACATTCTQ 



TGGTGCCATC 840 



21 



31 



lASGTGQ CTCTAAGTAA 



51 



I I I I I . 

MAMRQHAQRQ WGCGQAWGA3 VGGRSCEELT AVLTPPQLLG RRFNFPIQQK CGFRKAPRKV 
BPRRSDPGTS QEAYICRSALI PPVBETVFYP SPYPIRSIiIK PLFFTVGFTO CAFGSAAIWQ 
YBSLKSRVQa YFDOXXADNI. DSIIIPQKBGD FBKEIHKNHH NLSOGQSTVT aiIA»IVI.VP 
CLNRVFSLQR nUKYFTSNP ASKVLCSPNI. I.8TFSHFSI.P aMAANKXVLN SPSSSIVNII. 
GQBQFMAVYI. SAGVISHFV3 YliORVATGRy 6PSLGASGAI MTVIAAVCTK IPEGRIAIIF 
LPMFTFTAQN AIiKAIIAKDT A^ILGHKFF DKAAHLGGAL POIWYVTVGR EIiIWKiniEPL 
VKIWHEIRTN GPKKGGGSK 

Seq ID NO: 312 DNA sequence
Nucleic Acid Accession #: NM_000635
Coding sequence: 195.. 3656
CCTTTGATGA GGGOACTOGQ
: TTCCTGGTTT GACTGTCCTT 



I i 
■AOA CAGTCCCSAA OTTCTCaVAGG 
ACCCCSGG6A GGCAGTGCAG CCAGCT6CAA 



ATGCAATGAA TOGGGAAAAA 
GACACAGGAT 
CCTOGTGGAG 
GTCCTCCCCA 
ACTTCACCAT 
TATGACTOCC 



CAACXICCATT 
TCCAAQACRC 
TGGGGTCCAT 
CAGATGAQCT 



CAGTAACCTA CCAACTGACG 
AT0CCCCACX3 CTGCATTGGG 
GCTGITCCAC TGCCOGGGAA 
ACAATGGCAA CATCAGGTC6 
ACTTCCGGGT GTGQAATGCT 
GCATCAGA86 GGAOCCTGCC 
AGCCCAAGTA OGGOKCTTC 
CIGAGCTCTT CX»AATCCCA 
AOGASXGSn TCGG6AACTS 



TGTCCTTGQA 
6GCATCAACA 
GACCTTCAGT 
ACGGGAAAGA 
CX3GCAT6TGA 
AAGQCCAAAO 



GOAOATSAOC 
AGGATCCAGT 
ATGTTTGAAC 
GCCATCACCG 
CAGCTCATCC 
AA03TGGAAT 
GATGTGGTCC 
CCTGACCTTC 
GftSCIAAAST 



AATTTCTQTT CAAGACCAAA TTCCACCAGT 
ACAATQTGGA OAAAGCCCCC TGTGCCACCT 
ATCACAACCT CAGCAAGCAG CAGAATGAGT 
AGTCTCCAGA ATCTCTOGTC AAGCTGGATG 
GGATCAAAAA CTGGGGCAGC GGGATGACTT 
GGATTTTAAC TTGCAGGTCC AAATCTTGCC 
CCAGAGGACC CAQGGACAAG CCTACCCCTC 
TTGTCAAOCA ATA3TAC3GGC TtXXH'CAAAG 



GGTCaUVCCT 
ACATCTGCa«3 
TGTTCXXrCCA 
GCTATGCTGG 
TCaCTCAGCT 
OCCTGGTCCT 
TGCTTQAGGT 
GGTACGCCCT 



ACACGTGCGT 
GCGGAGTGAT 
CTACOWSATG 
GTGCATCGAC 
GCAGGCCAAT 
GGCCATGGAA 



CTGGaCTGGA 
GGCOGTGACC 
CATCCCAAAT 
GCCAACATGC 



OCAGOAGAAT 
TTGAQATCAA 
ACCACTOSGC 



CATTGCTGTG 
TGCAGAATCC 
AOACTGGATT 
QATGCTOAAC 
TQTCTGGCAG 
CAAAGCTGl 



TTCTOTOATa 

m:gcacaagc 
ctccatagtt 
ttcatgaagt 
tggctggtcc 
tacotcctgt 
gacgagaagc 



GGCTGAGCTQ 
GAGACTGCCC 
ACAACAAATT 
CCTTTGCTCA 
TGGGAGAAGG 
CCTTC3UW5GC 
AQCTCTACAC 
AGCCTTTGGA 
GGCTCAAATC 
AACTCTCCTQ 
GCXXaVGGCAA 
ChCCCCRCCA 
ACAAGAGGCT 
CACCCCCAAC 
QACAGAGGCT 
GCCCCACATT 
TGCTTTCCCA 
ACACQCCCAC 
AGGGTCCCCT 



CACC»TCCTC 
CTTATTCAGC 
CCTGGAGOAG 
TGGCAATGGA 
CAGGTAOGCT 
TGACSVTTQAT 
GQATGAGCTC 
AGCCTGTGAG 
CTCCAATGTG 
CCTCAGCAAA 



TGAGQATGGC 
CCAGCCGGCC 
GGCAGTGCGC 
GCCCCCCTGC 
CCAGCTGCTG 
GQAGGCCCTG 



TTTQCGACAG 
TGTGCCTTCA 
GAACGGCTGC 
GAGAAACTQA 
GTGTTTGGCC 
CAOAAGCrGT 
AGTGGGCAGG 
AOGTTTGATG 
ACCTGGGACC 
GCCCTCAGCA 
CTACAAAGTC 
CAAGGCCT6A 
CTGGTCCAA6 



ACATGCAGAA 
CTCCCATGTC 
OCCCTTTCTA 
GGAGACCCAA 
GTATGCIGAT 
AGACAGGAAA 
ACXXXXAGGT 
TGTTGGTGGT 
AGAAATOGCT 
TCGGCTCCAG 
CCCACCTGGQ 



CAACATCCIG 
CTGGAAAGAC 
GAATGTGACC 
TGAATACCGG 
TGGGAGCATC 
CTACTATCAG 
GAGAAGAQAG 
GCQCAAGACA 
ATCAGAGGCG 
TGTCTGCATG 
GACCAGTAGG 
CTTCATGCTQ 



QAaOAAQTGG 
CAGGCTGTCG 
ATCATGGACC 
TCCCGTGGGG 
ACCXXCGTGT 
GTAGAGGCCT 
ATTCCATTGA 
ATGGCGTCCC 
CTGGCCTGGG 
GATAAGTACA 
TTTGOCAATQ 



GGCCTCTOVQ CTCACCCCGA 



TCCGAGGCAA ACAGCACATT CAGATCCCCA 
CGCACCACTA CAGGCTCGTO CAGGACTCAC 
GCATGCATGC CAAGAACSTO TTCACCATOA 
OGACATOCAG CCGTGCC31CC ATOCTGGTGG 
ACTACCTGCC GGGGGAGCAC CTTGGGJJTTT 
GCATCCTGGA GCGAGTGGTG GATGOCCCCA 



TGACTCAQCC 
CTCCRAAAGC 
TGCCAGCCCT 
CTAGAGGAGT 
CTGAAGCCCA 
CTGACTGTGG 
GTCIGCAGCA 



TGGCCCAGGT GGCCACAGAA GAGCCTQAGA 
CAGAGTACAG CAAGTGGAAG TTCACCAACA 
TCCCGTCCCT GCGGGTGTCT GCTGGCTTCC 
GGTTCTACTC CATCAGCTCC CXXXX3GGATC 
COGTGGTCAC CTACCACACC CGAGATGGCC 
CATGQCTCAA CAGCCTGAAG CCCCAAGACC 
GCTTCCACCT OCCCGASQAT CCCTCCCATC 



1020 
1080 
1140 

1260 
1320 
1380 
1440 

1560 
1620 
1680 
1740 

1800

1860 
1920 
1980 
2040 

2100

2160 
2220 
2280 
2340 
2400 
2460 
2520
2580
2640
2700 
2760 
2820 
2880 
2940 
CTTGCATCCT CATCGGGCCT GQCRC3M3GCA TCSSOOCCCTT OaOC&tSTTTC TG GCABCR AC 3180

GGCTCCATGA CTCCCSVGCAC AAGCXSkSTGC GOOGKSGCOQ CATGACCITG OTQTTTGGGT 3240 

aCCGCCOCCC AGA.TCAGGAC CACATCTACC ftBGAGOftOAT dCTOGAGBTG GCCXftGAAGQ 3300 

GGGTGCKSCA TGCGGTGCAC ACSkGCXTTATT CCCGCCTGCC TCGCAAGCCE AAGGTCTATG 3360 

TTCa^GGACRT CCTGCGGCAG CAGCTGGCXai GCX3AGGTGCT aXTTCTGCTC CACAAGGAGC 3420 

CAGGCX»CCT CTATGTTTGC GGGGATGTGC GCATGGCCCG GGACGTGGCC CACRCCCTGA 3480 

AGCAGCTGGT GGCTGCCAAG CTGAAATTGA ATGAGGAGCA GGTCGAGGAC TATTTCTTTC 3540 

AGCTCAAGAG CCAGAAGOGC TATCAOSAAG ATATCTTTGG TGCTGTATTT CCTTACGAGG 3600 

CGAAGAAOGA CAGGGTGGCG GTGCAGCCCA GCAGCCTGGA GATGTCAGCG CTCTGAGGGC 3660 

CTACAGOAGQ GGTTAAAGCT GCOGGCACAG AACTTAAGGA TGOAJSCCAGC TCTGCATTAT 3720 

CTGAGGTCAC ASGGCCTGGO GAGATGGAGG AAAOTGATAT CCCXXMCCT CAAGTCTTAT 3780 

TTCCTCAACG TTGCTCCCCA TCAAGOSCTT TACTTGACCT CCTAACRAOT AGCACCCTGO 3840 
ATTGATGGQA GCCTC 

Seq ID NO: 313 Protein sequence: 
Protein Accession #: NP_000616



MACPHXFLFK TKFBQyAMNG EKGINNMVEK APCATSSFVT QUDLOYHNIiS KQQHGSPQPIi 60 

VECaKKSPES LVKLOATPIiS SFRHVRIKNW GSGMTFQDTL HHKAXGILTC RSKSCLGSIM 120 

TPXSI>TR6PR DXPTPPDELL PQAIBFVNQY YGSUCEAKIB EHLARVEAVT KEIETTVTYQ 180 

tlCTBLIPAT KQAVniNAPRC IGRIQWSNLQ VPDAESCSTA REMPEHICRH VRYSTNNOII 240 

RSAITVPPQR SDGKHDPHVW HAQIiIRYAGY QMPDGSIRGD PAHVBFTQLC IDLGWKPKYG 300 

RPDWPLVIiQ ANGROPELFE IPPDLVIiEVA MEHPKYBWPR EIiBLKWYAIjP AVANMLLEVG 360 

GLEPPGCPPN GWYMOTBIGV RDFCDVQRYN ILBEVQRRMO lOTHKLASLW KDQAWEINI 420 

AViaSFQKQN VTIMDHBSAA BSFMKYMQMB YRSRGGCPAD HIWLVPPMSG SITPVFHQEM 480 

LNYVLSPFYY YQVBAMKIHV WQDEKRItPKR HBIPIJOTLVK AVIiFACMLMR KTMASRVRVT 540 

ILFATBtQKS EALAWDLGAL FSCAmPKW CMDmRI^CL EEERUJ>VVT STFGtlGOCPG 600 

NQSXLKRSIiF MMCBUmKPR YAVFGLOSSM YPRFCAPABD IDQXLSHI.GA SQLTPKGEGD 660 

ELSGQEDAFR SWAVQTFKAA CBTFDVRQKQ HIQIPKLYTS NVTWDPHHYR LVQDSQPLDIj 720 

SKALSSMHAK NVPTMRLKSR QNLQSPTSSR ATILVELSCE DGOGLNYLPa EHI/3VCPGNQ 780

PALVQGILER WDGPTPHQA VHI.EALDESG SYWVSDKRLP PCSLSQALTY PLDITTPPTO 840 

LLIiQKLAQVA TEBPBRQRM; ALCQPSEYSK WKFTHSPTFI. EVI.EEFPSLR VSAGFLLSQL 900 

PILKPRPYSr S3PRDHTPTE IHLTVAWTY HTRDGQGPLH HGVCSTWLNS LKPQDPVPCF 960 

VRHAS6FHLP EDPSHPCIIiI GPGTQIAPPR SFHCXJRMDS QHKaWHGQRM TLVPOCRRPD 1020 

EDHIYQEEML EMRQKGVLHA VHTAYSRLPG KPKVYVQDIL RQQIiASEVLR VLHKBPOILY 1080 

VOaSVRMARD VAHTUECQIiVA AKI.KUIEEQV EDYPPQLKSQ KRYBEDIFGA VPPYEAKKDR 1140 



Seq ID NO: 314 DNA sequence
Nucleic Acid Accession #: XM_087254
Coding sequence: 47.. 2332 



1 11 21 31 41 51 

I I I I I I 

AGAGTACSTG TTTACAOATA AAACTGGTAC ACTGACAGAA AATGAGATGC AGTTTOSGGA 60 

ATGTTCAATT AATGGCATGA AATACCAAQA AATTAATGOT AGACTTQTAC CCGAAGGACC 120 

AACSW:CAGAC TCTTCAGAAG GAAACTTATC TTATCTTAGT AGTTTATCCC ATCTTAACAA 180 

CTTATCCC31T CTTACSUiCCA GTTCCTCTTT C3M3AACCAGr CCTGAAAA1G AAACTOAACT 240 

AATTAAAGAA CATGATCTCT TCTTTAAAGC AOTCAGTCTC TOTCRCACTO TACAGATTAG 300 

CAATGTTCAA ACTGACTGCA CTGOTGATGG TCCCTGGCAA TCCAACCTGG CACCATCGCA 360 

GTTGQACTAC TATGCATCTT CACCAGATGA AAAGGCTCTA OTAGAAGCTG CTGCAAGGAT 420 

TCGTATTGTG TTTATTGGCSV ATTCTGAAGA AACTATGGAG GTTAAAACTC TTGGAAAACT 480 

GGAACGGTAC AAACTGCTTC ATATTCTGGA ATTXGATTCA GATOGTAGGA GAATGAGTGT 540 

AATTGTTC3U3 GCACCTTCSWJ aTOAOAAGTr ATTATTTGCT AAAOGAGCTG AGTCATCAAT 600 

TCTCCCTAAA TGTATAGGTQ GAGAAATAQA AAAAACCAGA ATTCATGTAG ATGAATTTGC 660 

TTTGAAAGGG CTAAGAACTC TGTGTATAGC ATRtAQAAAA TTTACATCAA AASAOXATGA 720 

GGAAATAGAT AAACGCATAT TTGAAGCCAO GACIGCCTTG CAGCAGCGOa AAQAOAAATT 780 

GGCAGCTGTT TTCCaGTTCA TAGAGAAASA CCTGATATTA CTTGGAGCCA CaGCAGTAGA 840 

AGACAOACTA CAAGATAAAG TTOGAGAAAC TATTGAAGCA TTGBGAATGG CTOGTATCAA 900 

AGTATGGGTA CTTACTGGGG ATAAACATGA AACAGCTQTT AGTGTGAGTT TATCATGTGG 960 

CCATTTTCAT AGAACCATGA ACATCCTTGA ACTTATAAAC CAGAAATCAG ACAGCGAGTQ 1020 

TGCTOAACAA TTGAGGCAGC TTGCCAGAAG AATTACAGAG GATCATGTGA TTC3\GCATGG 1080 

GCTGGTAGTG GA-KKSQACXA GCCTATCTCT TGCaCTCAGG GAGCATQAAA AACTATTTAT 1140 

GGAAGTTTGC AGAAATTGTT CAGCTGTATT ATGCTOTOOT ATGGCTCX»C TGCAGAAAGC 1200 

AAAAGTAATA AGACTAATAA AAATATCaiCC TOAGAAACCT ATAACATTGG CXGTTGGIGA 1260 

TCGTGCTAAT GACQIAAGCA TGATACAAGA AGOCOVTGTT G6CATAGQAA TCATGGQTAA 1320 

AGAAC3GAAGA CAGGCTGCAA GAAACAGTOA CTATGCAATA GCKAGATTIA AGTrPCCICTC 1380 

CAAATTGCTT TTTGTTCAXG OTCATTTTTA TTATATTAGA ATAGCTACCC TTGTACAGTA 1440 

TTTTTTTTAT AAGAATGTGT GCTTTATCAC ACCCCAGTTT TTATATCAGT TCTACTQTTT 1500 

6TTTTCTCAG GAAAaVTTGT ATGftCAGOGT GTACCTGACT TTATACAATA TTTGTTrTAC 1560 

TTCCCTACCT ATTCTGATAT ATAGTCTTTT GGAACAQCAT GTAGACCCTC ATGTGTTACA 1620 

AAATAAGCCC ACCCTTTATC GAGACATTAG TAAAAACCXJC CTCTTAAGTA TTAAAACATT 1680 

TCTTTATTGG ACCATCCTGG GCTTCAGTCR TGCCTTTATT TTCrrTTTTG GATCCTATTT 1740 

ACTAATAGGG AAA6ATACAT CTCTGCrTGG AAATG6CCAG ATGTTTGGAA ACTGGACATT 1800 

TGGCACTTTG GTCTTCACAa TCATGGTTAT TACAGTCACA GTAAAGAIGG CTCT GBAAA C 1860 

TCATTTrroO ACTTGGATCA ACCATCTOGT TAOC TGGO GA TCTATTATAT TTTATTTTGT 1920 

ATTTTCCTTG TTTTATOGAG GGATTCTCTO GCCATTTTTO GGCTCtXSGA ATATGTATTT 1980 

TOKSTTTATT CAGCTCCTCT caUkGTGGTTC TGCTTGGTTT GOCATAATCC TCATGGTTGT 2040 

TACATQTCTA TTTCTTQATA TCATAAAQAA GaTCTTTGAC C6ACACCTCC M:CCTACAAQ 2100 

TACTGAAAAG GCACAGCTTA CTQAAACAAA TGOVGGTATC AAGTGCTTGG ACTCCATGTG 2160 

CTGTTTCXXB GAAGGAQAAG CAGOGTGTGC ATCT6TTGGA AGAATGCTGG AAOGA GTTAT 2220 

JM5GAAGATGT AGTCCAACCC ACATCAGCAG ATCATGQAQT OCATCGGATC CTTTCTATAC 2280 

CAAOGACAGG AQCATCTTGA CXCTCTCCAC AATGGACTCA TCTACTTGTT AAAGGGGCAG 2340 
TAGTACTTTG TGGGAGCCAG
ATGGCXaCAC TAGCTCTGAA 
GAGTTATAAT GGCAAACAAA 
TGAATCTGAA CATGTTAAAA 
TOTCCCTTGT GCTTATCGGA 
TTTAATRTAA ATGTAGAAAA 
TTGATTATTG ACTCTTCTAT 
AGAACTCTAT TTTTTTATTA 
ATACTGAGGA ATTTTGGTCC 
TTCACAGAGC AAATTAGGAG 
ATTTATACCA ATTCCTCTAA 
CAAGGGTATA TCATATATAC 
GTTTACTAAT ATTTTTGTGA 
TAGCATATTA TTAATTTAAt 
ACATATTTAA ATTTGCTTTT 
CATCGTTGTA CAGTTTAACT 
TATGTTTAAT TATACAAATC 
CTCTTTCTGC AGCCGACTTA 
AGGGTTTCAG TTAATAATCT 
CAATACAGTG AGTTCTGCCA 
TGTAAAAATG CTCAACTTGT 
TAATCGGGTA CATGTTACTG 
AATTTATCAA GTAGTTCAGT 
TTAAGATTTA GAAGTQATTA 
TT6TATACAT ATTAAGATAA 
AATTCCTTTA TGGAGATTTA 
GGCTTCTAGA ATrCGACTGG 
CTTGTTTGCT TGTAAACTAT 
TAACQATATT AAGTTATTAA 
OATATTTCAT AGCTGGATTT 
CAACGTACAA TQTCTQCATT 
CTCACTAGAG TACTAGGTGQ 
TGCATATAAC AAAATOACAC 
GCCATCAAAT AAACTGAGTA 
TOACCAACTG CASCAAOACA 
ATGTAATTTT CTQTACTCAC 
AAGTAAATGG CAACXyVCTAG 
TTAAAGCAAA ATTATCTTGT 
CAAXGCAGTC TGCAAGCTTT 
ACTACGTAAC CAGTAATCAC 
CTTTGOAAAG TATGATGTTG 
OCTAAATACG TTATTOCTAA 



TTCACCTCXrr 
ATTAATTTCC 
CAGAAAGCAT 
TTTGAGAATA 
CTCCTAATGG 
AAGAGAGAAA 
TTAAATCIGC 
GAGTTATATT 
CTCRGTGACC 
AATCATTTCC 
CTGTACTGTA 
AAATCAGGAA 
CAGAGTATAA 
GTCTTTATCA 
TTTCTCTTTA 
ATATCAATAA 
AGAATAOTAT 
GACATGCTCT 



TTCXTTAAAAT 
AAAATCTTTG 
TAGTACAAGC 
AAGAGACATT 
CATTTCAGTC 
TCTTAGTAAA 
TTCTSTAAAT 
TAAAGCTTTT 
TGTGTTGTTA 
AACCATTATI 
ACACAGCCTG 
TCAGGTCCGT 
AGACCCTATA 
•rrGGATCTTT 
CCTGAAGGCT 



TC3«3TGTGAT CACCXrTGTTA 2400 
TAGTAGTTCA 
CCCTCCCAAC 
TTTCATCTCT 






GAQTAITTTT 
TATGCTGAAA 
CATGGGAAAA 
ATTCATTAAT 
TACTGCAGTA 
TAAAGTTAGC 
TCACXX5AACT 
GTGGGTAAAT 
TGCATGCTTT 
CTGTGTATAO 



TACCCACTCA 
ACCXTTAATT 
TTGTCTGGTT 
GCC3VTTATAT 
TAGTATTAGC 



GTGTCCCAGT 
ATCAGGTAAT 
TAATTAACTC 
ATTGTCATTT 
TTAGCTTGAG 



OG OTAATT AA 
TCCCTTTCUV 
TTATGTCATC 



CATATAAATG 
TCAAATTGAT 
TAGATACTAT 
AATCTGGTTA 
TATTTCaTGA 
ATATTGCAAA 



TAAGCTAGAT 



Ca gQGG AAAG 
TATTTTCTTG 
GCTAAATATT 
AGGAAGATCT 
C»CTAATTCA 



CCAQTAGQCC 



TGTGCTCATC 
GATTTTAAGA 
CAGTAGTTTT 
AAGGAAAGTG 



TCaOTGOTCT 



CQGTTTTACC 
CTAAGCTTCC 
AATGGTAQAG 
CTAATGTAAC 
AATTTTCAAA 
GTTATTCTGG 
TGTTCCAGAA 
AAATTTGCTC 
TGCATTACAT 
ACAAAGACTC 
TOGCXriATAA 
TAGTTAAGQA 
CTGAACTGTT 

AAAGAGrrrr 

CTAGTGCTAT 
TCCCCmGC 
TACCCTTATC 
CAAAT03ATT 



nTCAGGTGT 
AATTAAATGC 
AAAACCTAAC 
TATTGAAAAQ 
CAG CTCTA AG 
AASTTTTCCC 
TTCCCATTTC 
ACAGAAATTA 
ATTTGTCTGT 
AATAGTCCTT 
AACTACTAAA 
GAGGAAATAA 
ATAAAATCTC 
TTACATOACC 
CAAAQTCATA 



GAACTTTATT 
ACTCCAAATC 
CTATTTATTT 
ATTCATCCTG 
ATATTTCTTT 
TGCCAAAACC 



TTTAOAATAA 
CAAACTACCA 
GGCTGTGGAA 
TAAGAATGAT 
TTCCATCCTG 
TAATQTTGTC 
CAAATAATGA 
rCQAAAATOT 
ATOAATATAA 
AGACTTTATC 
TCCAGTGATG 
CTTTAACTTA 
AAGAATAATA 
TGAAGATATA 
TTATAAAAGG 
GTGmATTT 
AAATAOCXTTA 
AGTGTGATT6 
TTTraAAAAA 
C3VCTCCGTTT 
AAGAAAGTAA 
TAAAACTCTT 
AAAATTCTTT 
AGAGCAAAAT 



2520 
2580 
2640 
2700 
2760 

2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 



GAGGGCTGTA AaCCTGAAGA TAGTGGCAAG CAOCAAGTCA GTTrCXSlAAA TTGCCCCTCA 
GCTGCITTAA GT6ACTCAGC ACC 



AGGGGAOAAT 
ATACACAAAG 
CAGABGCAAA 
AQGTCATTTC 
ATAGGCAGGA 
TGCTACTCTG 

CAAAOAOTTA 
CTATTTTGAA 
TQTGAAATAA 
AATCATGGTA 
GGTTTAAAAT 
TGCaTTGTCA 
GAAGTTATAT 



ATGGGCCAGG 
TTATGGTCTG 
GCTTCC3VGAC 
TCAACCCTAG 
TACTGGAAAA 
GTCAAGGCCA 
AAAAATCTOG 
AAOATCAGTT 
OSTAAAACAT 
ATGAGTTATC 
CrrGAATGTT 
ATTTAGATTT 
TTTGGACCTG 
GTAAATGTAG 
TTATCAAATA 



AGCTTC3U3CA 
GCTACGAA6A 
CTGTCCTCTT 



GAAATACTTG 
GATTGTGAGA 
CTAGAAAATT 
TQAAGGCIGT 



GTTTTATTAA 
TATTTTCATA 
GTTCCTATAA 
TTATGAGGAA 
AGACACTGTG 
TATATTATTG 
AAAACTTTCC 



CATTCTGCCC 
TTGAACTTAT 
GACAGTTAAS 
AGGAAAAGGG 
TATAAGCAGG 
TTTTOGTCCC 



CCTGCTGICG 
TCTTOSGTCT 
ATCCTGAACa 



CTGATCGCTT 
AGCCaUVAAGT 
AGAATCTTCC 
TATTAATAAA 



TGAGTATCTG 
GCTGTCTAAT 
TACAGCTACT 
TATAT 



ACTATTAAAG 
ATAACTCaTG 
GAAATATTGT 
GTAATCCTTT 
CATAATTTTT 



AGTTGAQAAA 
AAAGCTCATA 
QGAGACTAAA 
ACCAGGACTG 
GAGACTCCTA 
TTTTAAAATA 
ATGTTGGTGT 
AATAAC3VCAC 
CATTTTATTT 
TGCTGTTTTA 
ATATGTTTGT 
AGCAATACTT 
AAAAATTCTC 
TAAAGTTTAT 



I 

MQFRBCSmG 
NETEIilKKHll 
AAAKIGIVFI 
ABSSILPKCI 
RBEKLAAVFQ 



QPYCIFSQQT 
SIKTFLYnTI 
MALBTHFWTW 



I.ERVIGRCSP 



11 

1 

MKYQEINGRI. 
LFFKAVSLCH 
Gi3SSBTHBVlC 
GGEIERTRIH 
FIEia)LII.LG 
IWII.ELINQK 
CSAVI.CCRMA 
AKNSDYAIAR 
LTOSVYLTLY 
U3FSHAFIFF 
INHLVTWGSZ 
DIIKKVFDRH 
THISRSWSAS 



21 

1 

VPEX3PTPDSS 
TVQISNVQTD 
TLGKLERYKL 
VDEFALKGtiR 
ATAVEDRLQD 



41 



PI4KAKVIRI. 



HICFTSLPIL 
FGSYLIiIGKD 
IFYFVFSLPy 
LHPTSTBKAQ 



TU:iAYRKFT 
KVRETIBAIiR 
QLARSITEDH 
IKISPEKPIT 
BQHFYYIRIA 



GGILWPFLGS 
LTBTHAGIKC 
LTLSTMDSST 



Seq ID NO: 316 DNA sequence
Nucleic Acid Accession #: NM_004473
Coding sequence: 661.. 1791



4920
4980 
5040 
5100
5160 



5400 
5460 
5520 
5580 
5640 
5700 
5760
5820 



SHI.10n.SHLT TSS5FRTSPB 
LAPSQI.EYYA SSPDEKAI.VB 
HRMSVIVQAP SGEKIiFAKG 
SKEYEEIDKR IPEARTALQQ 
MAGIKVWVLT GDKHETAVSV 
VIQHGLWDG TSI.SLAWEH 



TLVgyFFYKir vcfitpqfi.y 
FBVIi(»IKPTIi YBDISKHRL]:. 
GNHTFSTDVF TVMVITVTVK 
QNinFVFIQL LSSGSAVfFAI 
CCCGCCTGGG CCCGOGCCCC GACCTCGCTG CCCCCGCCTC OCCTCTCTGC COtlFGGOGCT 120

TACCGCKACC TTCK3CCTCGG GGGCACXSGCa TGC3GCGGCCC CCXXXSVOATC GCCCAGCGCC 180 

AGTACTAACT GCCCTCGCTC TGGCCTTCGA GCCXX3AAGCC TCTTCTGCOC GCACAACCTA 240 

GGtSVGTAATC CTAAACTAGC GGGCACCACA GACCAGCTGC AGCCACCXX^ ACCCAGGGAT 300 

CACTTCCGGA CCCCTCGACC GCCCGGCACC AGCGCGCAAG GGACCCTTCA GCXX3GAGACC 360 

AGAGTCCAGT CC0GGTCXXX3 AGGCCACCGC CGCTGCCCGC CTCGAGAAGC AC3UVCGCGGG 420 

CTQAGCGGTC GGCTAGCGGG TCACTCCC6A GCCTCTGTCT GCACCGCGCX: AGCCCCAGAC 480 

CACGGACGCT QAGCCTCCAG CGCGCGCCAG CCTGGGCCGC TGGGCTCTCX: GGGCCStfSCCC 540 

GCGAGGATCC CCTGAGCTCT CCGCAGAAGG GCCGAGCGTC CGTTCCGQQQ AOQCC3W3GCC 600 

CGCCCCXGCC CCCCGACAGC CGCCXXSGATC CAGA6CC0GG G6GTGCGGGA CGCCCGCGCC 660 

ATGACTGCCG AGAGCGGGCX: GCCGCCGCOG CAQCCQSAQa TGCTGGCTAC COTGAAGGAA 720 

GAGCQCGQCG AGACGGCAQC AGGGGCCGGG GTCCCAGGOG AGGCCACSQaQ CCJOOGGGOCa 780 

GGCGGGCGGC GCCGCAAGCG CCCCCTGCAG CGCGGGAAGC CGCCCTACAO CTACATCGCG 840 

CTCATCGCCA TGGCCATCGC GCACX3CQCCC GAGCGCCGCC TCACGCTGGG CGGCATCTAC 900 

AAGTTCATCA CCGAGCGCTT CCCCTTCTAC CGCGACAACC OCAAAAAGTG GCAGAACAGC 960 

ATCCX3CCACA ACCTCACACT CAACX5ACTGC TTCCTCAAGA TCCCGCGCGA GGCCGGCCGC 1020 

CCGGGTAAGG GCAACTACTC GGCGCTCGAC CCCAACGOGQ AOQACATGTT CGAGAGCGGC 1080 

AGCTTCCTGC GCCGCCaSCAA GCXSCTTCAAG CGCT03GACC TCTCCACCTA CCCGGCTTAC 1140 

ATGCACX3ACG CGGCQGCTQC CGCAGCCXJCC GCTGCOGCAO COGCC6CCX3C CGCOGCOGCC 1200 

GCGGCCATCT TCCCAGGCGC GGT6CCCX3CC GCGGGCCCCC CCTACCXXSGG OGCCGTCTAT 1260 

QCAGGCTACG OGCXX5COGTC GCTGGCOGCG CC6CCTCX3U3 TCTACTACCC OQOGGOGTCG 1320 

CCCGGCCCTT GCCGOGTCTT CGGCCTGGTT CCTGAQOGQC CGCTCAGCCC AGAGCTGGGG 1380 

CCGQCACXjaT CX3GGGCCCGG CGGCTCTTGC GCCTTTGCCT CCGCOGGCGC CCCCGCTACC 1440 

ACCACCX3GCT ACCAGCCCGC AGGCTGCACC GGGGCCCX3GC CGGCXaACCC CTCTGC CTAT 1500 

GCGGCTGCCT ACGCGGGCCC CGACGGOGCG TACCOGCAGG GCGCCGGCAG TGCGATCTTT 1560 

GCOGCTGCTG GCCGCCTCGC GGGACCCXSCT TCGCCCCCAG CGGGCGGCAG CAGTGGCGGC 1620 

OTGOAGACCA CGGTGGACTT CTACGGOCGC ACGTCGCCCG GCCAGTTCSG AGCGCTXJGGA 1680 

GCCTGCTACA ACCCtGGCGG GCAGCTOGSA GGGGCCASIG CAGiSOOCCTA CCATGCTCOC 1740 

CATGCTGCCG CTTATCCCGG TGGGATAGAT CG6TTCGIGT CCBCCATSIG AGCCAG03TA 1800

GGGACGAAAA CTCATAGACA CATCGGCTGT TCACACGTTC CCXJGCaACCT GAGAACQAAC 1860 

AGGAATGGAG AGAGGACTCA ACTGGGACCC ACGTGGAAAA GACOGAGCAG GCCaCAGAQG 1920 

CTCGGTCTCC CCGCGCACAG CGTAGGCACC CTGTGTACTC TGTAAACGGG AGGAOGTGGG 1980 

GCGAGGCSVSC CAGA6CCCTT GGACTGGC3VC AGGGACCCTC QATGGAGCGA AGCCCTCAAA 2040 

OGGGATGCTT TCTGGCATTC TATCGGGGAG GGTCCTTGGC GGTAACCAGA GGGCAGCGTA 2100 

GTGTCAAC!AC CAOAQACCAG GATCCAAATT GTGGGGAATC AGTTTCAGCC TTCCATGTGC 2160 

TGCCGGAACT CGGGCCTTTT TAOGCGOTTC GTCCTCTAGT GCCTTTAACT QCGTTACTAC 2220 

AATAAAAGGC TGGGGCAGCQ CCTTTCTTCT TAAAGT6AGG AGGAOUVATT TGCAAAAOAA 2280 

ATAGGcrrrr crrcrrTTTT aaattgoaoa aatctcxgct croGTTGACc tgsgctogtt 2340 

TTCCCTGTCT CTGAGAACTT OAGACCTAGC TCCGAOTTGA ACTGTQOSTC AGCACTCX3VG 2400 

TCCX3VTCACC TGAACCTTCA GTCTCCCCCA TCTGTTACAC TAOAGGOCCQ CSkOGACTCTA 2460 

TCCACCX3CCC CCGGGTTATC ATTCAGOGCC CCATCATCTT GGATOCTGCC CTGCGTATTT 2520 

GGCAGCAATG GTGGGCCACC CAGGGCCTCT GAGTAGCCAC CCAAAGCCTA GCCGCTGTTC 2580 

TAGGGAACX3G AAAAGAGTTC ATGGCCAAGC GTCTAACCTA AAGTCCCAGG ATTGGCTCCA 2640 

GGCAGCAATT ATATCATAAC TTATTQAACT TTTGAGCAGG ACGTGCTGGT AATTTCATGG 2700 

CTGTTACTGC CCAGTCATAA ATCTGCTTTT CCATTATAAG GCAGAGAGAA GTACATTCGT 2760 

TCATTTOTCC ACTGTTTCT T GTCATCAOSC AGCCCFGGAC CCAAAGGGTG AACTAAAOTT 2820 

TAAGGAGA3G AGAGGATTCA AGOAGCCCST TGGTQAOGCC TTTCAGTAGC TGGGGAGGGC 2880 

TCTTCCATCC OCAGCACCCC CTGCTACACX: TCAGCAGCCT CCCCCATGCA AAA AGGA AAG 2940 

AGAAAAATTA AGTTAGGGCA OTCAGTAAAG TGAGCTTTAG AAAQAAACTG GAATTTTAAC 3000 

TTCATTTTGT ATCTTGCTTA AGTAGCAGGC TCACTAAAAT TAGAGAAAGT CCAATAACTC 3060 

TCCCCCTTTC CCTTGAGAAA TCTTTAAGTT TCGATTCTGG AGCAAAAACT TTCAGCATTA 3120 

AATATTTCAG AGGCTCCATT CACAGCTTTC AGATAAACTG GAGTCTTCAG ATGGACTGTT 3180 

TTAATAAAAA TCTTTGAGCA AGTGAGTTAT GGCAAGAGAA ACTCAGCCTC TTTCTGTATA 3240 

AACTTAACAO GGAAGGGCTG GGGTGTGAAA AAGAAGATTG TATGAAAACC ATTGGTAATT 3300 

TTTATTTTTT ATTTTTGGGA CTGCACTATC CTGTTCAajA AGACATGTGA ACTTGGTTCA 3360 

QTCCAAATGG GGATTTGTAT AAACCAGTGC TCTOCATTAG AAATATGGTG CAAGOaCAT 3420 

ATOTAATTrT AAAIATTCrA STAOCCACAT TAATAAAGTH AAAAQAAACA AAAAAAAAAA 3480 



FKHIAHZ80I DTSAMSCRZP TIOHPACTOR TTFMTAESGP PPPQPEVLAT VKEERGETAA 60 

GAGVFOBATG SGAGGRRRXR PUJRGKPPYS YIALIAMAIA HAPERHLTLG GIYKPITERP 120 

SFXKDMPKRM QHSIBBHIiTL NDCFLKIPRE AGRPGKGtJYW ALDPNAEDMF ESGSFIiRKRK 180 

RFKRSDLSTY PAVMHDAAAA AAAAAAAAAA AAAAAIFPGA VPAASPPYPG AVYAGYAPPS 240 

lAAPPPVYYP AASPGPCRVF GtVPERPLSP BLOPAPSGPG GSCAPASAGA PATTTGYQPA 300 

GCKSARPAMP SAXAAAYAGP D(».YPQGAQS AIFAAAQRLA QPASPPASQS SGaVETTVDF 360 
YGRTSPGQFG ALGACYHPGO QliGOASAfiAY HARHAAAYPG GIDSFVSAM 
CCGGGCAGGT GGCTCATGCT CGGGAGCGTG QTTGAGCGGC TGGCGCQQTT GTCCIGGAGC
AGGGGOGCAG GAATTCTGAT GTGAAACIAA CAGTCTGTGA GCCXTTCGAAC CTCOGCTCAG 
AGAAGATOAA GGATATCGAC ATAGGAAAAG AGTATATCAT OXCAGTCCT GGGTATAGAA 
GIGTGAGG6A QAQAACCAGC ACTTCTGGGA CGCACAGAGA CCGTGAAGAT TCCAAQTTCA 
aaAOAACTCQ AG05ITGGAA TGCCaWkGATG CCTTGGAAAC AGCAGCCOGA GCCGAGGGCC 
TCTCICTTGA TGCCTCCATG CRTTCTCAGC TCAGAATCCT GGATGAGGAG CATCCCAAGG 
GAAAGTACCA TCATQGCTTG AGTGCTCTGA AGCCCA TCCG GACTACTTCC AAACACCAGC 
ACCCAGIQCA CAATGCTGGG Crm'VVOCt GXATGACTTT TTCGTGGCTT TCTTCTCTGG 
CCajTGTGQC CCACAAGAAO
ACXSAQTCTTC TOACGTGAIVC 
AAGTTGGGCC AGACXSCTGCT 
TCRTCCTGTC CRTCGTGTGC 
TCATGGTGAA ACaVCCTCTTC 
TGTTGTTAGT GCTGGGCCTC 
CTTGGGCATT GAATTACCGA 
TTAAGAAGAT CCTTAAQTTA 
TTTGCTCX»A CGATGQQCAQ 
GAGeACCOGT TGTTGCCATC 
GCTTCCTGGO ATCAGCTGTT 
TCACAGCATA TTTCAGGAGA 
ATC3AAGTTCT TACTTACaTT 
AGAGTGTTCA AAAAATCCGC 
ACGGTATCAC TGTGGGTGTG 
CTGTTCATAT GACCCTGGGC 
TCTTCAATTC CATGACTTTT 
AAGCCTCAGT GGCTGTTGAC 
TAAAGAACAA ACCAGCCAGT 






: TAGAQAGACT 



CTGATGATCA 
GAGTATACCC 
CTCCTGACGG 
ACOSGTGTCC 
AAGAACATTA 
AGAATGTTTG 
TTAGGCATGA 
TTTATCCTCT 
AAATGCXyrGG 
AAATTTATCA 
GAGGAGGAGC 
GCTCCCATTG 
TTCX3ATCTGA 
GCTTTOAAAG 
AGATTTAAGR 
CCTCACATCA 
ATCCAGAACT 



OGCAGCTGGC 
AGGCAACAGA 
AAATCGTGCG 



TTTAC(X»GC 
COGCCAOGGA 
AAATGTAXGC 
GTCX3GATATT 
TGGTGGTGAT 
CAGCAGCACA 
TAACACCGTT 
GTTTGTTTCT 
AGATAGAGAT 
OQCCCRAGCT 



GTGOCAAGAA 
QATCTTCTGC 
TGGCTTCaGT 
GTCTAACCTG 
GTCTTGGTCG 
GGCCATCCTA 
CXrrGGGTGAG 
CGTTGGCAGC 
AATTATTCTG 
AATGATOTTT 
TGAACGTGTC 
CTGGGTCAAA 
GGAAAAAGCC 
TGCCAGCGTG 
GGCTTTCACA 
TTCAGTAAAG 
AATGGAAGAG 
GAAAAATGCC 



CIGTCCAAGC 
GAGCTOAATG 
GGCACCAQGC 



CAGTACaGCT 780 
CrTGCACTOA 840 
ACCATGGCAT 900 
CTCATCAACA 960 
CTGCTGGCTG 1020 
GGACCAACAG 1080 
1140 
1200 
1260 
1320 
1380 



CS^GAAGATGA 
GCATTTTCTC 
GGGTACTTCC 
GTGACCTTCT 
GTGGTGACAG 
TCCCTCTCAQ 



AGGCQGTQCT OGCAGAQCAa AAAOQCCACC TCCTCCIGGA CAGTGACGAO 
CCGAAGAGGA AGAAGGCAAQ CACATCCACC TGGGCCACCT GCGCTTACAG AGGACACTGC 
ACAGCATCGA TCTGGAGATC CAAGAGGGTA AACTGGTTGG AATCTGCGGC AGTGTGGGAA 
GTGGAAAAAC CTCTCTCATT TCAGCCATTT TAGGCCAGAT GACGCTTCTA GAGGGCAGCA 
TTGCAATCAG TGGAACCTTC GCTTATOTGG CCX»GCAGGC CTGGATCCTC AATGCTACTC 
TGAGAGACAA CATCCTGTTT GGGAAGGAAT ATGATGAAGA AAGATACAAC TCTGTGCTGA 
ACAGCTGCTG CCTGAGGCCT GACCTGGCCA TTCTTCCCAG CAGOQACCTG RCGGAGATTG 



^ CAGGACXATC 1 



TTGTTACCCA 
GCT6TATTAC 
CCATTTTTAA 



GTTCAGTQCC 
TCCZGQTTAT 
OOTTQAaTTA 



CCAGTTACAG 
GGAAAGA6GC 
TAACCTGTTG 
TTCACAGAAG 
AGTAAAGCCA 
CTGGTCAGTA 



AQTGCTATCC GGAAACATCT CAAGTCCAAG ACAGTTCTGT 
TACCTGGTTG ACTGTGATGA AGTGATCTTC ATOAAAGAGO 
ACCCATGAGO AACTGATGAA TTTAAATGGT GACTATGCTA 
CTGGGAGAGA CACC3GCCAGT TGAGATCAAT TCAAAAAAGG 
AAGTCACAAG ACAAGGGTCC TAAAACAGGA TCAGTAAAGA 
QAGGAAfiGGC AGCTTGtOCA GCTGOAAOAG AAAGGGCAGG 



CCCTCTCCAT 
GCACGCTGCG 
CTATGAAGTT 
TGQATGAAQT 



CTGTGCS3GCT 
TTATGCACGG 
TAACGGGGCT 
CGGTGGAGAG 
AGAACAAGGC 
AGATGAGGTA 
CTAAAGAGAA 
CCCTCTTCCG 
GTGATATTGO 
TGTTCAGTGG 
TTTGGGATGC 
TTGAATCTGA 
OCATA6CTAG 



TOACAGCATG AAGGACAATC 
GGCAGTCATG CTGATCCTGA 
AGCTTCCTCC CGGCTGCATG 
TTTTGACACG ACCCCCACAG 
TGACGTGCGG CTGCCGTTCC 
CTGTGTGGGA ATGATCGCAG 
CATCCTCTTT TCAGTCCIGC 



CTCATATGCA 
AAGCCATTOQ 
ACGAGCTTTT 
GGAGGATTCT 
AGGCCGAGAT 
GAGTCTTCCC 
ACATTOTCTC 
CTTTCCTCTC 



Z AGCATCTACX5 



CACCATCCAC 
T6ACAACCAA 
GGACCTCATC 
GCAJ3ATTCCC 
GTTCCAGTTT 
GATCAATCAC 

CCGAGAAAAC 



AGCATCGCCX: 



ACGGTCAGAC 
TACATTAAGA 
GACTGGCCCC 



TCTGGTGGAG 
CCTTQCCGAC 
CACTGTCAGA 



CTCCCTCTTG 
GTGGGGCGGA 
TTATCTGGAG 
CTCCGAAGCA 
TCAAATTTGG 



?AC 

TCATCACCAC 
CGGGTCTOGC 
TGGCaXCTGA 
CTCTGTCCTT 
AGGAGGGAGA 



GTTCATCXAG 
GTGGTTCCTT 
CAGGQTCCTG 
CCACATCACG 
GTTTCTGCAC 
GTGTQCGATG 
CAOGGGGCTG 
CATCTCTTAT 



CTTOGAAGCC 
TCCAAAGACR 
AACGTTATCC 
6TGGCAGTGG 
ATTCXXSGAGC 
TCCa«3CATAC 
AGATACC^GG 
CGQTGGCTGG 
ATGATCGTTC 
GCTGTCCAGT 
CXSATTCACCT 
GCCAGAATTA 



AlSISATGGJU} AATGGG6ATA 




AGAGACAOAC TTATTOATTC 
GACCATTGCC CATCGCCTGC 
GGGACAGGTG GTGGAGTTTG 
CTATGCCATG TTTaCTGCIG 



TCCTAAAQAA 
CAGGATCAGG 
GCTGCATCAA 
AACTCTCTAT 
ACCCCTTCAA 
AAGAAT6TAT 
ACTTCTCAGT 
AGATTCTGAT 



AGTATCCTTC 



GATTOATGGA 
CATTOCTCAA 
CXaUSTACACT 
TGCTCAGCTA 



ACX3ATCAAAC 
CTGGGGATGG 
GTGAGAATCA 



TCTATATATA 
TATTAAAATA 
TTGCTGTACT 
CTCTAGCTGQ 
ATAGTGQGCC 



CGTCCTCCTA 
OQCTTQTGTQ 
CATGZAAACA 
ATTATAATTG 
ATTCTGTACA 
AGCACTGTGC 
AGAGATCTGG 
TGGTTTCACG 
CTCCOACAGC 



CX33AAACCTT 
TTTCACTTTT 
AAATTTAGrTT 



ACAOGGTTCT AGGCTCCGAT AGGATTATGG 
ACACCCCATC GGTCCTTCTG TCCAACXaCA 
CAGAGAACAA GGTCGCTGTC AAGGGCTGAC 
AGAGCATTGC CATTCCCTGC CTGGGGCGGG 
GCCTTTCTCG ATrTTATCTT TOGCACAGCA 
AGGOAaAGTC ATATTTTOAT TATTSTATTT 



TAGCCTATAT 
TAATAACAGT 
TTTTGCTATT 
GTGCCAGGTT 
CCCCTCTGCC 



CTATAATOAA GCITTAXAOS TOTAGCTATA 
TTACAGTGAA AATGTAAGCT CTTTATTTTA 
GCATATTCCT TTCTATCATT TTTOTACAGT 
AGACTGTAGG AAGAGTAGCA TTTCATTCTT 
TTCTGGGTGT CCAAAQGAAG ACGTGTCGCA 
GCCTCCCCAC AGCOGCTCCA GQGGTGGCTG 
AGCGCGQTGA GTTCTCAGGG CTCCTGCCTT 



TTTCACTOCC 
TTICCTGCCT 
TCCCACT6CC 
GTTGGTTCCA 
ATTCCX3VCAC 
CTCACXBCAG 



TCCATCAAGA ATGOOGATCA CAGAGACATT CCICCQAGCC GGGGAGITTC 



TCRGGTTCCT 



ACCTCAGGTT 



CTCCACAGTT 
TOGTCGCACA 
TAATCAGTGT 
GCTGGTTGCT 



GCTSTTOTTT CTAAACAAGA ATCAGTCI 
ATGGCTGGCC ACTCCAC3VGA GCTCTCCAGC 
CCAACTGCTG CTTTTTGAGQ TGGCACTTTT 
CAGTGGCAGG GCTCAGGATT TCGTGGGTCT 
GTCTCTCTCT CTCTCTCCCC TCAAAGTCTG 
CTCACACTGa OGTAGAAGTT TTTGTACTGT 
GTGTGGTTTG GTGTGTTCCC GCAAACCCCC 
GCGTGOTCAC TGCTQTCATC ASTtQAATGG 



CCACAGAGAO 
TCCAAGACCT 
TCATTT QCCT 
GTTTTCCTTT 
CAACTTTAAG 
AAAGAGACCT 
TTTGTGCTGT 
TCAGCOTTGC 



1500
1560 
1620 
1680 
1740 
1800 



2400 
2460 
2520 
2580 
2640 
2700 
2760
2820 
2880 
2940 



3540 
3600 
3660 
3720 
3780 
3840 
3900
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 



4800 
4860 
4920 
4980
5040 
5100 
5160
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
ATGTCGTGAC CAACTAGACA TTCTGTCGCC TTAGCATGTT TGCTGAACAC CTTGTGGAAG 5760
CAAAAATCTG AAAATGTGAA TAAAATTATT TTGGATrrTC TJUVAAAAAAA AAAAAAAAAA 5820 
AAAAAAAAAA AAAAAAAA 
MXDIDIGKEY IIPSPGYRSV REKTSTSGTH RDREDSKPHH TRPLEC3QDMi BTAAHBEGLS 60

LOASMHSQLR ILDBEKPKQK YHHGI.SAI.KP IRTTSKHQHP VDKAGLFSCM TFSiOiSSLAR 120 

VAHKKGBI.SM EDVWSLSKHE SSDVNCRRLE RLWQEELNEV GPDAASLRRV VWIPCRTRLI 180 

IiSrVCIMITQ LAGPSGPAFM ViCHI.I.EYTQA TBSMLOYSLL I,VLGI.I.I.TEI VHSWSLAL.TW 240 

AUreETGVRL RGAILTMAFK KILKLKNIKE KSLGELIHIC SNDGQRMFEA AAVGSLIAGG 300 

PWAILGMIY NVIIIiGPTGF U3SAVPILFY PAMMFASRI.T AYFRHKCVAA TDERVQKMNE 360

VLTYIKFIKM YAHVKAFSQS VQKIRBEEHR ILEKAGYFQG ITVGVAPIW VIASWTFSV 420 

HMTLGFDLTA AQAFTWTVF NSMTFALKVT PFSVKSLSEA SVAVDRFKSL FLMEBVHMIK 480 

NKPASPHIKI EMKNATLAWD SSHSSIQNSP KLTPKMKKDK RASRGKKEBV RQLQRTEHQA 540 

VLAEQKGHLI. LDSDERPSPE EBEGKHIHLG HUtI«RTI<IIS IDLEIQEGKIi VaiOQSVGSa 600 

KTSIiISAILG QMTLLEGSIA ISGTFAYVAQ QAHILNATLR DNILFGKEXD BBRTMSVUIS 660 

CCIiRPDLAII. PSSDLTEIGE RGANLSGCQH QRISIARALY SDRSIYILDD PI.SALDAHVG 720 

MHIFMSAIRK HI^SKTVLFV THQLQYLVDC DEVIFMKBGC ITEHGTHEEI. MSliNGDYATI 780 

PMIUJXSBTP PVEINSKKET SGSQKKSQDK GPKTGSVIOCE KAVKPEEGQI- VQLEEKGQGS 840 

VPMSVYGVYI CAAGGPLAPL VIMALFMLNV GSTAFSTMOT, SYWIKQGSGN TTVTROJETS 900 

VSDSMKDNPH MQYYASIYAL SMAVMLILKA IRGWFVKGT LHASSRLHDE LFRRIUISPM 960 

KFFDTTPTGR lUJRFSKDMD EVDVRIiPFQA EMFIQMVILV FFCVGMIAOV FPWFI.VAVGP 1020 

IiVIliFSVLHI VSRVblRBLK RUINITQSPF LSBITSSiQG lATIHAYNKG QEFLHRYQEL 1080 

UJDNQAPFFL FTCAMRWLAV RUSLISIAI.1 TTTOUIIVW HQQIPPAYAG lAISYAVQLT 1140 

GLFQFTVRLA SETEARFTSV ERIHHVIKTL SLBAPARIKN KAPSPDWPQE GEVTFEKRHI 1200 

RYHENLPLVL KKVSPTIKPK EKIGIVORTG SGKSSIXaOM. FW.VEI.SGGC IKIDGVRISD 1260 

lOIADiaiSia. SIIPQKPVLP SGTVRSNLDP ENQYTEDQIW DALBRTm«E CIAQIiPIMiE 1320 



1 11 21 31 41 51 

AGCAGTTGCA CaVACTTCX»G CAACTTTCTC AGCCGGCTAC TAATGAGCTG AAAGCCAGGA 60 

ACATCCGAGG AGAAGAGAAA GCTTCCAGCC CTCCTCCCTT CACCXTTCGAA ATCCAGACAC 120 

CCCCACCCCX: ACXXTTCAGAT CACTTTAAGA TAATTTCTTT ATTCGTTTGC CTGACAGACC 180 

ATOaCTCCCT TTGOAAGAAA CTTGCTAAAQ ACrCGGC»TA AAAACAGATC TCOUVCTAAA 240 

GACATGGATT CAGBAQAOAA GGAAATTGrO GTTTGGGTTT GCCAAQAAGA GAAGCTTGTC 300 

TGTGGGCTQA CTARACGCAC C3VCCTCT6CX OATGTCaTCX: AGGCTTTGCT TGAGGAACAT 360 

GAGGCTACGT TTraSAGAGAA ACGATTTCTT CTGGGGAAGC CCAGTGATTA CTGCATCATA 420 

GAGAAGTGGA GAGGCTCCGA AAGGGTTCTT CCTCCACTAA CTAGAATCCT GAAGC TTTGG 480 

AAAGCGTGGG GAGATGAGCA GCCCAATATG CAATTTGTTT TGGTTAAAGC AGATGCTTTT 540 

CTTCCAGTTC CTTTGTGGCS3 GACAGCTGAA GaWAATTAG TGCAAAACAC AGAAAAATTG 600 

TGGGAGCTCA GCOCAGCAAA CTACATGAAG ACTTTACCAC CAGATAAACA AAAAAOAATA 660 

OTCRGGAAAA CTTTCCGGAA ACTGGCTAAA ATTAAGCaGG ACACAGTTTC TCAT6RTCQA 720 

GATAATATGG A6ACATTAGT TCATCTGATC ATTTCCXaGG ACCATACTAT TCATCAQCAA 780 

CTCRAGABftA TGAAAGM3CT G6ATCIGOAA ATTGAAAAGT GTCAAGCTAA QTTCCATCTT 840 

GATCGAOTAG AAAATGATOG AGAAAACTAT QTTCAGGATG CATATTTAAT GCCCAGTTTC 900 

AGTGARGTTG AGCSkAAATCT AQACTTGCAG TATGAGGAAA ACX3M3ACTCr GGAGGACCIG 960 

AGCGAAAGTG ATGGAATTGA ACAGCTGGAA GAACGACTGA AATATTACCG AATACTCATT 1020 

QATAAGCTCT CTGCTGAAAT ACAAAAAGAG GTAAAAAGTG TTTGCATTGA TATAAATGftA 1080 

GATGOSGAAG GGGAAGCTGC AAGTGAACTG GAAAGCTCTA ATTTAGAQAG TGTTAAGTGT 1140 

GATTTOGAGA AAAGCATGAA AGCTGGTTTG AAAATTCACT CTCATTTaAG TGGCATOCAG 1200 

AAAGAGATTA AATACAGTGA CTCATTGCTT CAGATGAAAG CAAAAGAATA TGAACTCCTG 1260 

GCCAAGGAAT TCAATTCACT TCACATTAGC AACAAAGATG GGTGCCAGTT AAAGGAAAAC 1320 

AGAGCGAAGO AATCTGAGGT TCCCAGTAGC AATGGGGAGA TTCCTCCCTT TACTCAAAGA 1380 

QTATTTAGCA ATTACACAAA TGACACAGAC TCGOACACTG GTATCAGTTC TAACCACAGT 1440 

CAGGACTCCG AAACAACAGT AGGAGATGTG GTGCTGTTGT CAACATAGTT CCAATGGCTC 1500 

CTTTCTGACC TGCTTTCATG TTTTAATGTT TGTTTAATTT AATAGGAAAC CTCaTTTTAA 1560 

ATATAACACT CAAAAAAATG TAAATCATAT TGTAGIATTC AATA6TTAAT AAAAACTCGA 1620 
GAAATGTGTT GTTTCTG 



Protein Accession #: NP_005438.1
ilAPFGRNLLK TRHKNRSPTK LlDSEEKEIV VMVCQEEKLV CXSLTKHTTSA DVIQALLEEH
BATPGEKRFI. LGKPSDYCII EKMRGSERVL PPLTRIUCLH KAMGDEQPNM QFVLVKnDAF 
LPVPLMRTAB AKLVQNTEKL HBI.SPANYMK TLPPDKQKHl V8KTPSKIAK IKQDTVSHDR 
DNMETLVHLI ZSQDBTIHQQ VKRMKEU)I<E lEKCBAXFHI. ORVEND6BMY VODAYIMPSP 
SEVEQNLDMJ YEHNQTLEDI. SESDGIEQIB ERLKYYiULI DKLSABIBKB VKSVCIDINB 
nABGBAASBI. ESSHUESVKC DLEKSKI»GL KIBSHLSGIQ KBIKYSDSIiIi QMKRKBYBIiL 
AKEFNSUIIS NKDGCQLKEH RAKBSBVPSS KGBIPPFTQR VPSNYTiroTD SDTGISSKHS 
QD3KTTVGDV VUiST 
A6CATTGAAG GGGAAGGAAC TGCGGQTGTG GTGTGTGTAT GTGTGTGTGT ATGTGTGTGC 60

GGCQCGTGCG TGCX3TGTGTQ TGCOOCXXJCr AGTGTGTGGA CAAGGAGGTG GGGGCAGCTG 120 

AGTTASA6TC CCAACTCTTO OACTCXaVrrT GCTATTCTCT TCTTTCTCCC CCACACCTAT 180 

CTGOrGQIGG TAQTGGGOGT TTATATTTGC GTTCCTTTTC ATTCATTTCT AAATCTCTTA 240 

AAAATTTTGG GTTGGGGGrA TTGGC3GAAGG CAGGAAAGGQ AAAAGGAGAQ TAGTAGCTGA 300 

AGAGCAAiSAG GAGGACATGG AGATGAAGAA GAAGATTAAC CTGGAGTTAA GGAACACATC 360

CCCGGAGGAG GTGACAGAGT TAGTCCTTGA TAATTGCCTG TGTGTCAATG GGGAAATTGA 420 

AGGCCTGAAT GATACTTTCA AAGAACTAGA ATTTCTQAGT ATGGCTAATG TGGAACTAAG 480 

TTCGCTCGCC CGGCTTCCCA GCTTAAATAA ACTTCGAAAA TTGGAGCTTA GTGATAATAT 540 

AATTTCTOGA GGCTTGOAAa TCCTGGCAOA GAAATGTCCA AATCTTACCT ACCTCAATCT 600 

GAGTGOAAAC AAAMAAAAG ATCTCRGTAC AGTAGAAGCT CTGCAAAATC TTAAAAATTT 660 

GAAAAGTCTT GACCTGITTA ACTGTGAGAT CACAAACCTG GAAOATTATA GAGAAAGTAT 720

TTTTGAACTA CTGCAGCAAA TCACATACTT AGATGGATTT GATCAGGAGG ATAATGAAGC 780 

GCCGGACTCT GAAGAGOAGG ATGATGAGQA TGGAGATGAA GATGATGAAG AGGAAGAGGA 840 

AAATGAAGCT GGTCCA<XXX3 AAGGATATGA GGAASAGGAG GAGGAAGAGG AAGAGGAGOA 900 

TGAGGATGAG GATGAAGATG AAGATGAAQC AGGTTCAGAG TTGGGAGAGG GAGAAGAGGA 960 

ACTGGGCCTC TCATACTTAA TGAAAGAAGA AATTCAGGAT GAAGAAGATG ATGATGACTA 1020 

TGTTGAASAA GGGGAAGAAG AGGAAGAAGA GGAAGAAGGA GGTCTTCGAG GGGAGAAGAG 1080 

GAAACGAGAT GCTGAAGACG ATGGAGAGGA AQAAGATGAC TAGATCATTC TAAGACCRGA 1140 

TTCTCTAATG TTTCTGGGTG TGCAATASAG TGATCACATC TTTGTTTCTT CATGTACaAT 1200 

AGCTATCCCT ACAQAAGATA ATSTGTAACT TTTTATAGGA AAAGTGTGGT TTTACTATTT 1260 

TTGCCTTATC ATTCCAAATA AOAACTAGTC TGTTAATQAT CATATTGTAT GTAGAGAAAA 1320 

ATTTTCArrG ACTCCCATTG TGGAATTCCC TAGCAATTTA TTTAGACTTA ATTTTTTAAA 1380 

TTCAAGCTTA CTGTATTAGT CATTTTTAGC CCATAATTAA AACATGATCA CTTTTAAACA 1440 

GGTGTAGTAT GGTGCATTTC ATTCCTTATT TATAGATTAA CTGAAATTAC AGTTTGCTAT 1500 

AATATAAAAT GACAATAGTC TCTTC5AGTGG TAAGTTGGTT ATTTTTTTAG AGGTQATCCA 1560 

GGAATCTTTA QTTTGAAGGC AQTTACCTTT TTTTTTTTTT TTTTTTTTTG ACTAAGAOTG 1620 

XrrGGTTCCT TTTTTCTCAC AAGTAACTTG GAAAATAGAA GCAGAATAGT AAAGGTTCTA 1680 

TTCAGCAACA TAGTTCaVTGG ATTTTGTGGA GGTTCTATTC AGTAATATGG TTCATGGATT 1740 

TAGTGGTOAC TGATAAQATT TTATTTTTGA AGGAAAAATT GCTTATACTA AGTCCAGAGA 1800 

CATGCAGGTG AGCCXTTTTTG TCAGGCTGCA AATCATOACA TGCCGATGGT TGTTTATTTT 1860 

GTTTTTAGQT GTGCATTCTT TTTCTTCTTA GCAATTCCTT TATOATCACC TTCCCTTCTT 1320 

GTTTCACrCC CTCCCGCTCT CTCAAAAGQA ACrTGGQAAA CTTGTGAAAC CCAOGAAAAC 1980 

CTTTAGTCTT ATACCTCAAC TACGTTTCAG TCCTGTCTGG GTTT TAAATA AGrGAAOIAG 2040 

AAGAAATTGA GTATTTTCTG ACATAAGAAT ATATTATOVA TACAGTTTTA TGCaGTAAGC 2100 

TCTCCTTACC ATAAATGTTT CTTGGTTGAC AACATCTAAO ACAATATTAG TGGGATGAAG 2160 

AAAGAAAAGC AGGGGTGCTT TTGGAAGCAG TQTTAGTGTT CCTCAAAAGT CGGAACAATT 2220 

GCCTGTTGAT ATATTAATAA GACATTAAAG TCAAATTTTA ATGTTGGCCT CTCAAATGAT 2280 

TTGGATACCA CTCTGCAAAG TATTTCTAAC CTTTAATTCX: CAGTTTTAAA ACaGATATAA 2340 

TAATAGCATT TAATTGGAAT ATACTAGGCA GCTGGAAAAG TATTTGAAAC TAAATTGACA 2400 

TTAAAATTAA OATTTGITTT CAAGTGGATG TCCATTAAAA GTAGAAAAAT ATTTGGGATA 2460 

AGTGAGTGTG TGTTTCCrTA CATGGCTACT AAATAAAATA TAATSftOTAT ACAAGTATAT 2520 

CTCCTCTTTT GCTATGOAOa CTCCATGTTC AAGGCAATGQ CTTTTTAAAT CTTGGCTATC 2580 

TAAAATTTTT TCCCTTrOTT TTCAATATTT GTAAGrPTIT AAGAACTTAG TGTCROCAAA 2640 

TTAATTGAAG TTATGCTTCT ATACTGQGAC ATATT TAAA T ACTQACTATA GTACTGCTGC 3700 

TACTGCTTCT ACftATGTAAA ATGTATGACT TGGTGTTTTA AAOTAAAAAT TATGATQTTA 2760 

CTTOTGGAGA AACTAAAAAT GTTGTACAAC TGACCGAAAS AAAACCCTTG GGGATAfifiTT 2820 

TAGTGAGGGG ATIGGAATCC CCAAAAAGAT AACATTTTTC TTCTGCTTTT AAAAACTGAA 2880 

ATTCCCTGTT CTAGTTCCTA ACAATTCTCA TTACATACTA TGCCAGATTA CAAAATACTT 2940 

ATTTTTAAAA TQRAATCTAT ATATTQACTT TCTTATCAAT CATCTTACTG TGCAATCAAA 3000 

ATTAQAGTAC TTTGGTTTGA AAACAACACT TAQAGCCTCK AGA TAACTT T TAAGACT TAT 3060 

TTAGCTTTOT GGGTGGPTATr TTCATOCRAA TAAOTAAGGG TGSGTTTTAT ATTTTGTAGA 3120 

AOTTTTCGGT CCTATTTTAA TOCTCTTTGT ATGQCAGTAT GTATATATTG *GTTAAGTIC 3180 

CTCAAGAATC TCCTTAAAAA CTTTGAAQTT AATACTTTTG TGCAACTGTQ TTTTOAATAA 3240 
AGOCATGACA GTGTTAAAAA CAAAC 
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Seq ID NO: 324 DHA sequence 
Nucleic Acid Accession #: NM_003
Coding sequence: 324..2722



80 CCATGCGCGC CGAGCOGGCG TGACCGGCTC 
OCTCTOGCXX3 GCX»CACX3GA GCGGCGCCCG 
CAQCTCXSCGG CAGCXSCXXX; TGGCGGGCTG 
AOOOGGCGCX: GCCGGCTCGG TGCCTGCCAQ 
GCTTCTCQTC CTTCTCCTGC TGCCTCCGCT 

85 GGCTGCtGCQ CCCAGCGCTC CGCATTGQAA 
GGCAQATGAA GACAATACAT TGCAACAGAA 
AATOCAfiAAA GAAATCACAC TGCCTTCAAG 



GGAGCTATGA GCCATGAAGC 
CAGCCTTGCC GGCGCTTCCT 
CGCCCCGGCC CGCACGCCGC 
CGCCGCCTCG TCCCGGCCCC 
TGAAACTGCA GAAAAAAATT 
TAGCAGCAGT AATATCAGTT 
ACICATATAT TACATCAACC 



CCTGCOGCCT 360 
GCGCCTGGGG 420 



TGGGAGTCCT 
ACAGOIATGC 
AAGACrCGGA 
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AAGCX:CTTAT CSWXSTTCTTa 
CCATCTGGCC CAOGCAAGCT 
CATACTGAAC AATGOTTTGT 
ACCACAGTAC TCTAAGGGTG 
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ACACAAAGGC 
TCCAGATTGA 
TGTCTTCTGA 
GAGAGCACTG 



CTTCOTGTAT 
ACATATAATC 
GGAAAGAGC3T 
AGCAGTGAAT 
TAATGATCAC 
AAAGTCCQTO 



ATGATAGAGC 
OMSAAAACCT 
GACCAGTGGC 
CCATCACGTG 
AAAACSTATA 
GTCAACCTTX3 



CACTAGAGCT 
TGGCAGGACA 
CCTTTCTCTC 
GTATATTTGA 



TGOACTTCAT 
GGTTCATGAT 
GTATTCTAAG 



GCAGATGCTC 
GCACCTCATC 
TGTCTGTTCT 
ACAAGTATTA 
AAAGCCAAAA 
GTCCCATTCT 
AGGABGTGOA 
RAATQGATAC 
ATTATGCTGT 
TAACAATACC 
GTGTQATATT 
GCAAGACGGA 
CAGAGACAAC 
CTATGAAAAG 



CATGAGTTCT 
TCGCGGGTGA 
OGCACAAGAG 
TCGCAGAGCC 
TGTGACTGCA 
CGAAAATTTT 
GCCTGCCTTT 



TGGATTCTAT 
GGACTGAQAA 
CAAAATACCG 
CATTTCACTA 
GAGTTGGTGT 
TGGCTCAAAA 
CAGAATCCTG 



TCGAGCTCCA 



AGATTGCAGT 
GGGTCCTAGT 
TATTGTCCTT 
TACTCAGCAA 
ATTCTGGGTA 



CTGTCTCTTT 
GGOGGCAAAA 
ACGAAGGAAC 



AAGAAATGTT 
TCATQTCTTT 
ACTGAATATT 
TATGCATGCA 
CAGTGTCAGT 
CTGAATACRG 
TGCRGCAflAC 
CGTATTGGTC 
ATTGACTGCA 
GATGGAACGC 
GCCCTAAATA 
GTGTGTAGTA 
ATCCGGGATC 
GCCACCAATC 
GGGGGCACAG 
GGCCCCATCT 
TGACATACTC 
TAATGACTAC 
TGGAAATAAT 
GACCATGCTA 
AACACACACA 



TCAACAGGCC 
GGGAGGAGTG 
CCCTCTCCAA 



AGAAATGAAA 
CTCTTCTCAT 
TTACAAGGAG 
GGATCAGATT 
GCAGCGCATT 
TAAGAGAAGC 
GAATGAGTAT 
CCTTGGAATC 
GGGTGQCTGC 
C3VTTTTGGAO 
AACAAAGCTA 



CAAAAACATA 
TCCAAATTCA 
ATTCACTACX3 
GGAAGCATCA 
GGCATGTTTG 
GAGAAAAGCA 
CAAATGAA6A 
TGGTTGAAAA 
TATTTGGAAC 
GCACATACCA 
CAGCTCAACA 
GACATCACCA 
AAGCAGCATG 
AGTCTGRGTT 
GGTCTTCCAA 
CAATGGGAAC 
ATCATGOAQG 
TATAGAOACT 
TTTGAGCCCA 



ATAAGGCTGT 
TTCTTGACCT 
AAAATGGGAA 
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CCAACCCTGT 
CTGATGCTOT 
ACTTTGGAGQ 
TGGCAGTGGC 
CTTCTAQCAG 



QTACTOGAGA 
ATCAAAATCA 
ACATCTGGGG 
AAGGCACTGA 
ATOATGTGTT 
AACTTCAGGG 
GTGGTGCCCA 
CATGTGGCCC 



ATOAAGCX»C 
CAGTTAGSAA 
TCATAATAGG 
GCTGGGGATT 
GAATC3VGCTQ 
GCAGCaGTCT 
GGAGCTAAAG 
GTCAAAGAAC 



CGGGGCTCAC 
AGGGTATGAA 
CTCTGGTCAG 
GGGCCGCTGC 
AACAAAGGCT 
GAAGGGAAAC 
CTGTGGATTC 
TGAGATCATT 
TGTAGTTTTA 
GTCTATGATG 
TCCACTCGAT 
CTGCATTTGT 

cxrrrcAcccc 

CTCCaTCGCT 
TAAAAATGTC 
OGCTGOATGa 
TACTGGAACT 



TGCAGCGACXS 
TGCCGGGATG 
TGCCCACCAA 
TACAATGOCO 
GCAGGGTCTG 
TGCGGGAAGG 
TTACTCTGTA 
CCAACTTCCT 
GATGATGATA 
T6TTTAGATC 
TCCAAGGGTA 
GATTTCACCT 
CCCAAGGATG 
GGTGCCATCC 



CAAAAATTAA 



ACCTTTCACK 
TGTTCCAGAA 
ATGOVATAAA 



ACACCGCCTT 
ATTAAGTTTG 
AAGGATGGGG 
ACCTGTCAGT 
TCTTTTTTTT 



AA6ATGATAC 900 



1020 
1080 
1140 

1260 
1320 
1390 
1440 



1740 
laOQ 
1860 
1920 
1980 

2100 
2160 
2220 
22B0 
2340 
2400 
2460 
2520 
2580 

2700 
2760 
2820 
2880 
2940 
3000 



TTTTACaGAG 
CGGAATGTGG 
AATGCTATGG 
GGCCCTGCTG 
CTGTGAACGA 
ATCTTCATAA 
AOTGCAAGAC 
ACAAGTTCTG 



CCAATCTTAC 
TCTACCATCA 
CGGATGTGGG 
GGAAGTGCCT 
AAGTCTGTTC 
GGGCAGGGAC 
AA6QACCCAA 



GGTTCGATCTC 
GCACTGTTGG 
TAAACAAAAC 
TAAAAGAAAA 
AAACGGGGGA 
TCCCTAATGG 
AAAA 



1 



MECFFGS8SRQ PPLAGCSLAG 
RPRAWGAAAP SAPHHNBTAE 
DJQDSESPYH VLDTKAHHQQ 
HYENQKEOyS IKSGBHCyYHG 
RST6RPHIIQ KTLAGOYSKQ 



1 



ITTOPVQMLH 
IiPMAVAQVLS 
RDFLQHGGGA 
SDGPCCNNTS 



EFSmRQRIK 
QSLAQNI.QIQ 
CLFNRPTKLF 
CLFOPRGYEC 
OQYIWGTKAA 
IGQLOGEIIP 



ASCGPQRGPA GSVPASAPAR 
KNLGVLADED NTI^QNSSSN 
KHNKAVKLAQ ASFQIBAFGS 
8IRGVKDSKV ALSTCHGLHa 
QWPPLSELQW 



QHADAVBIiIS RVTPHYKRSS 



TPPCRLLLVIi LLLPPLAASS 
ISVSHAMQKE ITUPeRLIYY 
KFIUJLimN GLLSSDWEI 
MFEDDTFVYM lEPLELVHDE 



BPTECGNCyV EAGEECDCGF 
RDAVNECDIT EYCTGDSGQC 
GSDKFCYEKIi NTEGTEKGNC 
TSFYHQGRVl DCSGAHWLD 



U]TRVVI.VAV ETWTBKIX2ID 
LSYFGGVCSR TR6V6VNEYG 
MEETGVSHSR KFSKCSILEY 
HVECYGLCCK KCSI.SNGAHC 
PPNLHKQDGY ACNQNOGRCY 
GKDGDRWIQC SKHDVFCGFI. 
DDTOVGYVGD GTPCGPSMMC 
FTHAGIDCSI RDPVRNI<HPP 



CXTTCTCCauV 
ACCTGAGAGC 
AACAGATOQA 



I 



I 



EDEGPKQPSA TNLIIGSIAC AILVAAIVIiG GTGHGFKNVK KRRFDPTQQG PI 

Seq ID NO: 326 DNA sequence
Nucleic Acid Accession #: AK074418.1
Coding sequence: 244-isiS 

1 11 
I I 

GACGGCCGGC 
CTGTGGAACC 
AAAAQTQAGG 
GCTGGAGTTC 
ATTACCAGGA 
CCTTGCGGGA 
CAGATTCTTC 
AGCGGCXXCA 
TTGACATCCA 
CTCRGAACCC 

cTGGca.TTrr 

ACCGCCTACC 
AGTTCTGGCC 
TGCACTATGG 
TCCATCTGCA 
GCTCCXTTGAT 
ATGGGCTGGT 
CAATACGQAA GGGGCTGGGA 



GTGATTCATG 
CAAAACCAAG 
TATTCCQATC 
ATCACCAACA 
ACCAAGGCAG 



ACAGTACAGG 
CCGTTTCCGG 
TGTCCAGQGA 
CTGCXTOCTG 
CTTCCTCGAG 
CTCTTCCCCT 
AACCTGTQCC 
GAGTCTCCAT 
AGAAATTATC 
V GAGGGCGCTG GAGTGATGGG 



CATGCTCTCC TCCTCTGCCA GTCTCCTCCA CCACTCTCTA 60 

TGCCOSTCTC CCCTCCTCCA TCAGACACAC CTGCCTAGGA 120 

GACCX5GTGRG TGACTTGCTG CTAAAGTTTA TACCAGATGC 180 

TGCTGTGCCT G6AAA6QACC TCGGAAOTCT TCTAAGGAGA 240 

GCCTTCAGTG GAGACCTCCA TCATCAACTT 300 

TCACTCCCTG AGCATQGGCC GGAGOmAA QGATQAGACA 360 

CATAGGCCAG AAGCTGCTCC AGGAAAAACO CCTCTCCAAT 420 

GGATCTACCA GGGGGTCCTC CTCACTTCAT CCTGGATGAT 480 

ACAAGSAGOC GCAGCrOACT GCTGGTTCCr GGCAGCACIG 540 



CAGAAGATCC TGATGGTCCA AAGCTTTTCA 
TTCTGGCAAT GTGGOCW3TG GGTGGAAGTG 
GATAAATGCC TCTTTGTGCG TCCTCGCCAC 
GAGAAGGCCT ATGCXMAflCT GCTCGGATCC 



ACTCCftAGTG 

GCCTAC&CTG TGACIGG6GC lt»GCAGATT 
TCrCieTGOh ACCCCTGGG6 CXGGGGCXSAa 
TCTCAOGAGT GG6AGGAAAC CIGTGATC03 



1020 
1080 
1140 






CGQAAAAGCC AGCTACATAA OAAAOGGGAA GATOGCSAQT TTTQGATaTC QTGTCflAGAT 1200 

TtCCMiaah AATTCRTOGC CATOTTTATA TGTAGOGAAA TTCC3ATTAC CCTGGACCAT 12S0 

GGAAACACAC TCCACC3AAGG ATOGTCCCAA ATAATGTTTA GGAAGCAAGT GATTCTAGGA 1320 

AACACTGCAG GAGGACCTCQ GAATCSATGCT CAATTCAACT TCTCTGTGCA AGAGCCAATG 1330 

OAAGGCACCA ATGTTGTCGT GTGCGTCACA GTTGCTGTCA CACCATCAAA TTTGAAAGCA 1440 

OAAGATGCAA AATTTCCACT CGATTTCCAA GTOATTCTGG CTGGCTCACA GAAACACTGT 1500 

CX3UVAGCTCA AATAATAAAT TCOSCCGCAA CTTCACCATG ACTTACCATC TGAGCCCTGG 1560 

GAACTATGTT GTGGTTGCAC AGACACGGAG AAAATCAGCG GAOTTCTTGC TCCGAATCTT 1620 

CXTGAAAATG CCAGACAGTG ACAGGCACCT QAGCAGCCAT TTCAACCTCA GAATGAAGGG 1680 

AAGCCCITCA GAACa«X5GCT CCCAACAAAG CATTTTCSJM: AGATATGCTC AGCAGGTATG 1140 

OTACCTACCA CCCAGOOGCC rrAOOTGOGA TTGGAGAAAG GGGACCTGAG GGAGGGACAG 1800 

CCCrCACAGG CCCTTACTOCS GATGCAGAOA GOAGAAGTaA CTTGATGGAC TATTTTACCI 1860 

GCCTCTCTTC CTOGATCXSTC TCCAGAACTG CTGTGGCTGC CAAGCTCGGT AGAGACGTGG 1920 

CQCCCCACCC AQTCTCATCC GGGGGACTTC AAGCTGGAAT GCAGAGCTTA GAAAGQOAGG 1980 

GGATAATTAT GGGGTGTGAG GTGCATTGCC CTCTAAATCT TTAAACAAGC AATTGGCAGT 2040 

ACCCCGTGRA ACCTTTCCTr CTCCTACrC& GCCACCTCCC ACCAACCTGG CATCGTTCCr 2100 

CCCXSGGAGCr AGCCAGCTTC AGAAAGCACA TACAGCATCC TTGCTGCCAA ACCACCTATG 2160 

TGCACACAGG ATTTCCTTAA TGGCTTAAIA AACTGTTATA AftOAACTCCT TGACrTGTCA 2220 

OAATAAAATA GCTaCCAGGG GCTCTGCACA ATQAGCXTTCT TACCGTTAAA AAAAAAAAAA 2280 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 

Seq ID NO: 327 Protein sequence
Protein Accession #: BAB8507S.1 

1 11 21 31 41 51

I I I I I I

MAYYQEPSVB TSUKFKDQD FTTLRDHCLS MQRTFKDETF PAADSSIGQK IiLQEKRLSMV 60 

IVIKRPQDI>PG GPPHFILDDI SRFDIQQGGA ADCWFIAALG SIiTQMPQYRQ KILMVQSFSH 120 

QYAGIFRFRF WQCGQWVBW IDDRLPVQGD KCLFVRPBHQ NQEFWPa.IiE KAYAKLUSSY 180 

SDIiRyOFIiED ALVDIiTGOVI TNIHLKSSPV DLVKAVKTAT KAGSIiITCAT PSGPTOTAQA 240 

HBHGLVSIiHA YTVTGAEOIQ YRRGMHEIIS LWNPHGWGET BWRGHWSDGS QBWEETCDPR 300 

XSQLHKXKED GBFWMSCQDP QQKPIAMPIC SEIPITIiTOia STtHEGHSQI MFRKQVIU3H 360 

TAGGPBNDAQ FNFSVQEPMB GTMVWCVTV AVTPSNI.ICAE DAKFPLDPQV ILABSQKHCP 420 
KLK 



Seq ID NO: 328 DNA sequence

Nucleic Acid Accession #: BC0n490.1

Coding sequence: 74-2768 

1 11 21 31 41 51 

I I I I I I

6TGGGTCAOQ TQAACCACTT TTCaOSCQAA ACCTGGTTGT TOCTGTASTO GCBGAO AGG A 60 

TCGTOGTACT GCTATGOOGG AATCaTCOGA ATCCTTCACC ATGGCATCCa GCCCGGCCCA 120 

GCGTOGGCGA GGCAATGATC CTCTCACCTC OVacCCTOBC CGAAGCTCCC GOCGTACTOA 180 

TOCOCrCACC TCCaVGCCCTG GCCGraACXrr TCCACCATTT GAGGATGAGT CCGAGGGGCT 240 

CCTAGGCACA GAOGGGCCCC TGGAGGAAOA AGAGGATGGA GAGGAGOTCA TTGGAGATGG 300 

CATGGAAAGG GACIACOGCX; CCATCCCAGA GCTGGACGCC TATGAGGCCG AGGGACTGGC 360 

TCTGGATOAT GAGOACGTAG AGGAGCTGAC GGCCAGTCAG AGGGAGGCAG CAGAGCGGGC 420 

CATGOGGCAG OGTGACCGGG AGGCTGGOCG GGGCCTOOGC CQCATGOGCC GTGGGCTCCT 480 

QTATOACAGC GATGAGGAGG AOGAGQAGCG CCCTGCCCGC AAGCX3CCX3CC AGGTGGAGCG 540 

GGCX3VCG6AQ GACQGOQAGQ AGGACGAGGA GATGATCGAG AGCATCGAGA ACCIGGAGGA 600 

TCTCAAAGOC CACTCTCTGC GCXaGTGGGT GAGCATQGOO 6GCCCtX3GQC TGGAGATCCA 660 

CCACOGCTTC AAOAACTTCC TGOGCawrrCA OGTOGACAGC CAOGGCCACA AOGTCTTCAA 720 

GGAGOQCATC AGCGACATOr OCAAAGAGAA COGTGAGAGC CTGGTGGTGA ACTATGAGGA 780 

CTTGGCAGCC AGGGAGCACG TGCTGOCCTA CTTCCTGCCT GAGGCACCGG CGGAGCTGCT 840 

GCakGATCTTT GATGAGGCTG CCCTGGAGGT GGTACTGGCX: ATGTACCCCA AGTACGACCG 900 

CATCftCCAAC CACATCaVTG TCCX3CATCTC CCACCTGCCT CTGGTGGAGG AGCTGCGCTC 960 

GCTGAGGCAG CTGCATCTGA ACCAGCTGAT COGCACCAGT GGGGTGGTGA CCAGCTGCAC 1020 

TGGOQTCCia CCCCAQCrCA GCATGGTCAA GTACAACTGC AACRAGTGCA ATTTCGTCCT 1080 

QaOTCCrrTC TGCCAGTOCC AGAACCAGGA GGTGAAACCA GGCTCCTGTC CTGAGTGCCA 1140 

STGGGGOGGC OCCTTTGAGQ TCAACATGGA GGAGACXATC TATCAGAACT ACCAGOGTAT 1200 

CCGAATCCAG GAGAGTCCAG GCAAAGTGGC GGCPGGCEGG CTGCCCOGCT CCAAGGACGC 1260 

CATTCTCCTC GCAGATCTGG TGGACaGCTG CAAGCCAGGA GACGAGATAQ AGCTGACTGG 1320 

CATCTATCAC AACRACTATG ATGGCTCCCT CAACACTGCC AATGGCTTCC CTGTCTTTGC 1380 

CACTGTCATC CTROCCAACC ACGTGGCXauV QAAGGACAAC AAGGTrGCTG TAGGGGAACT 1440 

GACCGATGAA GATGTGAAGA TGATCACTAG CCTCTCOUVG GATCAGCAGA TOGGAGAGAA 1500 

OATCTTTGCC AGCATTGCTC CTTCCaTCTA TGGTCATGAA GACATCAAGA GAGGCCTGGC 1560 

TCTGGCCCTG TTCGQAQGGG AGCCCAAAAA CCCAGQTGGC AAGCACAAGG TACGTOOTGA 1620 

TATCRAOOTG CTCTTGIGCQ GRGACCCTGG CACAGOSAAG TCGCAGTTTC TCAAGTATAT 1680 

TQAGAAAGTG TOCAGOCGAG CCATCTTCAC CACTGGCCAG GGGGCGTCGG CTGTGGGCCT 1740 

CACQG03TAT GTCCAGCGGC ACCCTGTCAG CAGGGAGTGG ACCTTGGAGG CTGQGG(XCT 1800 

GGTTCTGGCT GACCGAGGAG TOTGTCTCAT TGATGAATTT GACAAGATGA ATGACCAGGA 1860 

CAGAACCAGC ATCCATGAGG CCATGGAGCA ACAGAGCATC TCCATCTOQA AGGCIGGCAT 1920 

CGTCACXrrCX: CTGCAGGCTC GCTGCaCGGT CATTGCTGCC GCC3VACOCXA TAGGAGGGCG 1980 

CTACGACCCC TCX3CTGACTT TCTCTGAGAA CX3TGGACCTC ACSkGAGCCCA TCATCTCA06 2040 

CTTTCACATC CTGTGTGTCG TGAGGGACAC CGTGGACCCA GTOCAGGAOG AGATSCTGGC 2100 

CCGCTTCQTG GTQGGCAGCC ACGTCAGAC3V CCACCCCAGC AACAAGGAGG AGGAGGGGCT 2160 

GGCCAATG6C AGOGCTGCTG AGCCGGCCAT QCCCAACACG TATGGCGTGG AGCCCCTGCC 2220 

CCAQ6AGGTC CTGAAOAAGT ACATCATCTA OGCCAAGGAG AGGGTCCACC CGAAGCTCAA 2280 

CCAGATGGAC CAGGACAAGG TGGCCAAGAT GTACAGTGAC CTOAGGAAAG AATCTATGGC 2340 

GAC3VGGCAGC ATCCCCATTA CGGTGOGGCA CATCGAGTCC ATOATCOQCA TGOOGGAQGC 2400 

CCAOGCXSCGC ATCCATCTGC GGGACTAT6T GATGGAAGAC GACGTCAACA TGGCCATCCG 2460 

CGTGATGCTO GAGAGCTTCA TAGACACACA GAAGTTCAGC QTCAJ60GCA G CATG OGCAA 2520 

GACTTTTGCX: CGCTACCTTT CATTCCX3G0G TGACAACAAT OA6CTGTTGC TCTTCRTACT 2S80 

GAAGCAGTTA GTGGCAGAQC AGGTGACATA TCAOOGCAAC GGCTTTGGGa OCCAGCAGGA 2640 

CACTATTGAO GTCCCTGAQA AGGACTTGGT GGATnAGGCT OSTCAGATCA ACATCCACAA 2700 




CCTCTCTGCA TTTTATOACA 
AAGGAAAATG ATCCTGCAGC 
TTCTGQTTTQ GGOTCGTCAG 
TGAACTCGGG GTACTAGQGT 
TCrrraiTTC TCXaAGCCCG 
TQTCTTACTT GGTTGCTGAA 
TTGGATCAGA GCTGCTC3A.GT 
ATGGATGTCA GGAGAGCTGC 
TGCCTTTGGC CAGAGAGCTG 
TCTGTGCCCE TGTGOTGGAA 
GTGQCAGGGQ TGGGATOrGA 
TTGCTCCCTG TCTGTTTCCC 
TAATTTTTAA TAAAGTTGAA 



TGCCCTCTGT 
CAGGGCTTAT 
CTTTGTGCTT 
CATCTTGCCA 
TCAGGATGCC 
TGCXCTCTTG 
GTTGAAGATG 



6ctttaigga 
agcagqatgt 

CTCACCTTTQ 
CCTCCGAGTG 
TGCX3TGTGGT 
GCGTGAGTTG 
TTTGTAATCG 
CAGTGCCAGC 
TTATCCACTC 
TTTGTGCATT 
AAAAAAAAAA 



ACGACCTGAA 
TTCCTTGGGA 
CACAAAACCA OAGCACTTGA 
CTGGCTGCAC CTGGCATGAC 
GGTGGGATGC 
CTTTGTCTCC 
TTAGGTGTTA 
CGTATTCaGG 
TTTTCAGTCT 
GCAGCGTTCT 

QCCACAOTTA 

OGGTTTGGTT TCTOTAGTIT 
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ACTCAGTACC 
GCCTTCTTAC 
CTGCTTTTGC 
CCTGCAGGTT 
GGGCTCCTCA 



2680 
2940 
3000 
3060 
3120 
31S0 

3300 
3360 
3420 






DREAGRGU3R 
SVREV)VSh4AG 
SKVIiAYFIiPG 
HIiNQLIRTSG 
FEVNMEETIY 
HYDGSLHTAH 
lAPSIYGBED 
SRAIFTTGQ6 



CWRDTVDPV 



PRLEIHHRFK 
APAELLQIFD 
WTSCTGVLP 
QNYQRIRIQB 
GPPVPATVIL 
IKROLAIiAIiF 
ASAVGLTAYV 
ISKA6IVTSL 
QDEMUUIFW 
VHPKLNOraX) 



RRQVERATED 
GHNVFKBHIS 
YPKYDRITNH 
KOIPVLGPFC 
PRSKSAILUV 
VAVGBLTDED 
HKVRGDINVli 
LEAGALVLAO 
QARCTVIAAA NPIGGRYDPS 



EAALEWLAM 
QIjSMVKYNCN 
SPGKVAAGKL 
ANHVAKKDHK 



DKVAKNYSDIi 



AEQVTYQRfJR FGAQQDTIBV PBKDLVDKAR 



BUtDYVIBDD 
AEQW 
UQF 



Seq ID NO: 330 DNA sequence
Nucleic Acid Accession #: Ml. 7254
Coding sequence: 257-1645



RKESMATGSI 
HRSMSKTPAR 

QnnHmcsAF 



DMCKENHESIi 
IHVRISHLPL 
QSQNQEVKPG 
DLVDSCKPGD 
VKMITSLSKD 
LCGDPGTAKS 
RGVCLIDEFD 
LTPSEHVDLT 
AAEPAHPNTY 
PITVRHIESM 



EAAERAMRQR 
lENIiESLKGH 
WHYEDLAAR 



EIELTGIVHN 
QQIGEKIFAS 
QFI.KYIBKVS 
KMNDQDRTSI 
BPIISRFDIL 
GVEPLPQEVL 
IRMAEAHAKI 
LLLFIIOCQLV 



I 



GTCCGCGOST 
CX3GGTCGCAC 
CTTTGGAGAC 
AAAGATGaCA 
TGGCTTACTG 
CTTATCAGTT 
GGCTAAGACA 
CCCACGCOTC 
GGAATGTAAC 



GTCCGCGCCC 
TAACTCCCTC 
CCXJAGGAAAG 
GAACCAAGGG 
AAGGACATQA 
GTGAGTGAGG 
GAGATGACCG 
6CTCAGCAQO 
CCTAOCCAGa 



GOGTGTGCCA 
GGCGCCGACG 
CCGTGTTGAC 
CAACTAAAGC 
TTCAGACTGT 
ACCAGTOGTT 
CGTCCTCCTC 
ATTQGCTGTC 
T6AATGQCTC 



41 
I 



CGTCAGGTTC 
CCCX3GACCCA 
GTTTGAGTGT 
CAGCGACTAT 
TCAACCCCCA 
AAGGAACTCT 



AGCAGATCCT 
AGAATATCGC 
GTOCAAGATG 
TCTCTCACAT 
TGATAAAGCC 



GOAAGOCACC 
AGAGCGGAAG 
CTATGACAAG 
CCACGOOATC 
CTCAGACXTC 



GAGGGACrTTA 
GGACATATCA 
AAGGACAAAG 
AATCOC3VCTA 
AACATACCGT 
TCAAAAACAA 
ACTGCATGGC 



CACATGCCAC 
AOGCTATGGA 
CTTCCA6ACG 
ACCAAGGACG 
CTCCACTACC 
TTACAAAACT 
AGGAQATCAG 
TCTCCTTCCA 
CTTGOACXAA 
TTGCICCIoa 

aaoggsqaot 
agcaaaccca 
aacatcatga 
gcccaggccc 
ccxita:catgg 
ccagccctcc 

CCAA CTGG GO 
gqcacttact 
TCOCCAGAAA 
TTTACTGGGG 
CTGAAGTCTT 
TCTQTGGACT 
TGCCAAAGAA 
ATGCAAACTG 
TTATAATGCC 



CCCCAAACAT 
GTACAGACCA 
TCAACATCTT 
ACTTCCAGAG 
TCAGAGAQAC 
CTCCACX3GTT 
CCTGGACCGG 
CAGTGCCCAA 



TGGCOQTG 
CCTCTCGGTT 
ACAAATGACT 
TGAACAGCTO 
QCAGCrCATA 
GCCTACGGAA 
GOACAGACrr 
GCCAGGGTCA 
CCTGATGAAT 
AT6AACTACG 



6TAGATGGQC 
TCAAGGAAGC 
CGCCACACCT 
CCAAGAtGAG 
CCATCAAAAI 



TGTGGGQCAG 
GTTATTCCAG 
GCTCACCCCC 
TCCTCTTCCA 
AATGCATGCT 
TCaCGGCCAC 
AACTGAAGAC 
CCTT6CAAAT 



AACATCGATG 
AGCTACAACG 
CATTTGACTT 
AGAAACACAG 



ACATSAACTA 
CCAAGGTCCA 
TCCAGCCCCA 
GCTCCTATCA 
CCQTGACATC 
GTATATACCC 
ACTAAAGftCC 
CTCTATCGGA 



GOATCCCQAC 
OQATAAGCTC 
TGGGAAGCGC 
CCCCCCC3GAG 
CGCCCACCCA 
TTCCAGTTTT 
CAACACTAOG 



TACGCCTACA 
TCATCTCTGT 
CAGAAGAT6A 



ACTACAGAAA 
GACCTTGTAA 
AGTGGTCTTA 
GGATGAAACT 
ATTTTAAGGA 



AIGXaCTOTT TTGaTTOAAA 



ACTATGAACT 
AGGGGTGAAG 
TCTCAAGCAA 
ACTCXjAGGGT 
GTAATGOAGA 
TGTCAAAIGA 
TCATTATGTQ 



TCATGCAGTC 
AAGGGAAGTA 
AAATTTTAAC 
GGGGCTTTOT 



ACTGAGGATa 
AASAG6CAGA 
ACTCAGGACA 
AGTGTTATAC 
GTAGAATTCA 
TGGAATTaTC 
TCTCCACAQG 



GAACA1GAAT 
AAGCOGGGGA 
TGAGGAGGAT 
AAGACAGTGT 
AGAAATGTAT 
AAAGCAATAG 
AAACTACCTG 
CTGTGGCCCA 
TOAATACAT 
AGTTTCC3UIC 
TGTATAGAGT 



TTTTCCCATC 
CAAAACI60C 
AOAOATCCAA 



GCTAAAAATG 
ATGTA6AAGC 
AAACTTTAGA 
AAACAACACA 
TATTTAAAAA 



TTTGGGGACT 
CAAACCCAGT 
GAAACAAAAA 
TGATATTTAA 



TCCGTTTGAT 
TCCTTTACAG 
GAGCGTGTGA 
ACCAGGCTGG 
GTGTACAATG 
GTTAGGAGAA 



GGAAGGAACT 
CCGACATCCT 
CAGATGATGT 
ATTTACCATA 
AGTCGAAAGC 
AGTTAGATCC 
GCCAGATCCA 
GCATCACCTG 
QOCGCTGGGG 
TCGG'ITACTA 
AGTTCGACTT 
ACAACSTACCC 
ACTTTGTGGC 
CAAACCaVTA 
QCCATATGCC 
ABCCTGCATT 
TCAAGAGGAA 
AGACTCTTGG 
TCACGAATAT 
ATGAAGTCTT 
GTAGAGTTTG 
GTTTTGACCT 
TAGTTTCATA 
TTGATATGCA 
GGACAGCTGT 
TATTACCGGG 
TTGTAGACAG 
GAAAGAAACT 
AGTTATGGAa 
AGGACACAGC 



1020 

loao 

1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 



1920 
1980 
2040 
2100 



2280 
2340 
2400 
2460 
2520 

2sao 

2640 
2700 




ACAATCAGAA ATCAC6CAG6 
AACGCTGTGC GTTTGTCAGA 
ATAATTATAT AACTTATGCA 
C6AC3kAAAGA GACaUlTCaAT 
TACAATATGA AQTTATTAGT 
TAGCATGGCA AATCAOATTT 
TTGCTTAAT6 AAAACATQT6 
GGAACTTGIG CAAGGGAI3A6 



CATTTTGGGT AGQOGGCCTC CAGITTTCCT TTCAGTCGCX3 2760 

ATGAAGTATA CAAGICAATO TTTTTCCCCC TTTTTATATA 2820 

TTTATACACT ACGAGTTGAT CTCGGCCAGC OlAAGACACA 2880 

ATAATGTGGC CTTGAATTTT AACTCTGTAT GCTTAATGTT 2940 

TCTTAGAATQ CAGAATGTAT GTAATAAAAT AAGCmSOCC 3000 

ATACAOOAGT CTGCATTTGC ACTTTTTTTA GIXSACTAAAG 3060 

CTQAATGTTa TGOATTTTGr GTTATAATTT ACTTTGTCCA 3120 
CCAAGGAAAT AGGATGTTTG GCACCC 






11 



31 



41 



51 






I I I I I I 

MIQTVPDPAA HIKBAIiSWS EDQSIiFECAY GTPHLAKTEM TASSSSDYGQ TSKMSPRVPQ 
QDWLSQPPAR VTIKMECNPS QVNGSRNSPD ECSVAKOGKM VQSPDTVGMN YQSYMEEKHM 
PPPNMTTNER RVIVPADPTL WSTDHVRQWL EWAVKEYGLP DVNILLFQHI DGKELCKMTK 
DDFQIlI.TPSy MAOIUiSHIJI YLRETPLPHI. TSODVDKALQ HSPRUfHARN TOLPYEPPRR 
SAHTGH6KPT PQSKAAQPSP STVPXTEOQR PQU3PYQII.G PTSSRUWPG SGQIQLHQFL 
LELLSDSSttS SCITWEGtNG BFKMTDPOEV ARUWGERKSK PHMNYDKLSK ALRYYVDiOII 
MTXVKGKRYA YKFDFHGIAQ ALQPBPPBSS IXXXVSDLEY HGSYHAKPQK MNFVAHIPPA 
LFVTSSSFFA APHPYWISPT GOIYPNTRLP TSHHPSHLGT YY 

Seq ID NO: 332 DNA sequence
Nucleic Acid Accession #: NM_000020
Coding sequence: 283-1794

21 31 41 ' 51 



AGAAACATTT 
GAGCQAGCCC 
CCAGCGCTGG 
AGGCTAGCGC 



TCTCGGGGCC 



TTATTAGGAG 
TTGCTCCAGC 
CTCCCCGQCT 
CGGTGCAACT 
CCCGCCACCC 
TTCTOATGCT 
CGCTGGTGAC 
GGTGCACAQT 



CCAGCCCGGT CCX3GGGCCGC GCCGGACCCC 
GGTGGAGGGG 
CCCAGAGOSA 
TTGGTGACCC 



GCAGAGCGGG 
GCTGATGGCC 
CTGCACGT«3T 



CACTACTGCT 
CAACCTCCTT 
CTGGCCTTGC 
CAGG2USAAGC 
TCTGAGCAGG 
GGCTCAGGGC 
TGTGTGGGAA 
GCCSTCAAQA 
AAC3VCAGTAT 
CGCARCTCGA 
GACTTTCTGC 
GCATGCGGCC 



AGCGTGGCCT 
GCGACACGAT 
TCCCCTTCCT 
AAGGCCGCTA 
TCTTCTCCTC 
TGCTCAGACTA 
GCACGCAGCT 



CAOGGAGCTC 
CCTCTGCAAC 
GGGAACAQAT 
GGCCCTGGGT 
GCACAGCGAG 
GTTGGGGGAC 
GGrQCAGAGG 
TGGCGAAGTG 
GAGGGATGAA 
CX3ACAACATC 
GTGGCTCATC 
GCTGGAGCCC 
GCACGTGGAG 



CGGGAGGAGG 
TGCAGGGGGC 
CACAAOGTGT 
GGCCAGCIGG 
GTCCTGGGCC 
CTGGGAGAGT 
CTCCTGGACA 
ACAOTGGCAC 



CCATGACXTTT 
AGGGAGACCC 
ATTGCAAGGQ 
GGAGGCACCC 



QAGATTGCCC 
GATGTGGTSC 
CAGACCCCCA 



TGGGCACCAA 
TTGAGTCCTA 
GCOGGACCAT 
CCAATGACCC 
CCATCXXTTAA 
AGTGCTGGTA 
AAAAAATTAjS 
TCCTTTCTGC 



CX3TGAATGGC 
CAGCTTTGAG 
CCGGCTGGCT 
CCCAAACCCC 
CAACAGTCCA 



CaMSTCCTGGT 
CTAGGCTTCA 
ACGCACTACC 
CftTCTGGCTC 
ATCITCGQTA 
CTGGTCAAGA 



6ACATCIG6G 
ATOGTGGAGG 
GACATGAAGA 
GCVkGACCCGG 
TCTGCCCGAC 
GAGRAGCCTA 
CTG6GGGGGT 



GTCTG60CTG 
CAGCATGGTG 
GTGCCAAGCC 
CCXTTTGATCA 
CCCTGGCACA 
CCCATCAGTT 
TCCTCAACAA 
ACTAG6QCAT 
AAAAGGGCA6 
GCCAAOCAXG C 



CC&GCTCACC 
GTAS CTGGG A 
CAGGGTTTCA 



CTCAAAQCGG CAGGCTCCCT 
CACCCCCTAC CACTCCCGGG 
AGGGAATCCC AGTCCCAGAC 
ACCCCACTGC CXCACCAGAG 
CACTTCCCTG CCAGGCCTCA 
TCTCTCTCTG GATTTGTATC 
OA6TGCAGCT TGCTGAATGT 
TAAATCCTAA GAGGTOCTAC 
6TCAGATGGO CAAGGCCCAO 
GCAGGGGGAA GGTCAGTGGG 
GTGACAAAAG CAGGCCTGTC 
GACACGGAGT TTCGCTCTTG 
GCAAOGTCTA CCTCCCAGGT 
TTACAGGCAC ATGCCACCAT 
CXIATGCTGGC CATGCtGGTT 
TGGGGTTACA 



ACAGGATGCA 
TCAGAGCCCG 
CTGCCAGGGT 
QCCTCTAGCA 
TCAQCTCCAT 



CCCTGGTGCT 
CCCTGATCCT 
TGTGGCATGT 
CCAGTCTCAT 
GTGACTGCAC 
GGCnQGTTGC 
TOTQQCACGG 
TCC3GGGAGAC 
TCGCCTCAGA 
ACXiAGCACGG 
TGAGGCXAGC 
CACAGGGCAA 



CCTTTGGCCT 
ACTATAGACC 
AGGTQGTGTG 
TCCTCTCAGG 
TCACCGCGCT 
AAGTGATTCA 
GGGGGGCAGT 
QATGGGCAGC 
CTGAAACCTQ 
TCTCTOCCCA 
AAAGAGGCTC 



I 

CGCTGGAATA 60 

CAGCTGCGCC 120 

AGCXXX3CCGT 180 

GGTCCGCCGA 240 

GGGCTCCCCC 300 

TGTGAAGCCQ 360 

GCCTACCTGC 420 

CCAGGAACAT 480 

GTTCGTCAAC 540 

GOAGGCCACC 600 

OGGCCCCGTO 660 



GGCACAGGGC 
TAAGCTCXaVG 
GATGCCTTGG 



TTGTCCAGGC 
TCaAATCATT 
GCCTQGCTAA 
CTCGAACTCC 



CTAHJITCTCT 
ATGCrCCAGC 
CAAOGAGTGT 



AT6GGCICTA 



CCCTGGCAAT 
CTGGAGCACC 
GCCACCCTTG 
CTGTGGCATA 
CTCAAAAGAA 
AGGCC!CACAC 
GAGAGACACA 



CCTATATCKC 
TTGCCTCAAG 
TCXTTAGTCTA 
GGCrCAGACA 
GTCTTCTCTG 
ATTTGGCTCC 



TAGA6TGCAA 
CTCTTGCCTC 
TTTTGTATAT 
TQACCTCAfiG 
ATOGCGCCTQ 
CCTTTGAQGC 



TQAGAGTOTG 960 



1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 

1740 
1800 
1860 

1980 
2040 

2100 
2160 
2220 
2280 
2340 
2400 



TGGCATGATC 
AGACTCCCXyi 
TTAGTAGAAA 
TGTTCCACCT 



CATGACCTCC 
CTCCCTCTAC 
TGTGTCCGCG 
ACCAGCCATT 
GTSTTGCATC 



G6TGCTGTGG 
ACCCTTCTAT 
TQTGGATCAG 
CCTAGCTCAG 
GCGGATCAAG 
ATAGCCCAGG 
GGATGGTGCC 



ATCCCCTGCT 
CCCCTATGGC 
CAGAGTCAGA 
TTGCCCCCTG 
CCTGTCC3U3C 



GCTCTGGGCC 
CXXCAGGACT 
ATCCAAGAAG 



QAAAATAACT 
CTCCAGTTCT 
TTTTCACCAC 



CAGAAAQTTT 



GCTCCAGCTC 
GTGTGTCTCA 
GAAATTTTCA 



TCTATTCCIT 
TTACCTGACT 
TGCCTAAAAC 
AAGCCAGCCC 
TTCCTCCAAG 
CCCTACTGGC 
GOAGAATTCA 
AGGRTGTATG 



2940 
3000 
3060 
3120 



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TAT<3GYTCAC GTATGGWGCaV GGTTGTCCTG GTCCyKGGGT OCAGOGAAOT GOGCTaCAGQ 3S40 

GAAGTGGATT GGAGGGGAGC TTGAGGAATA TAAGGAGCGG GQGTGKUKSVC TCAGGCTATO 3600 

GRCAAGGACA GCCCCAAGGT TGGGAAGACX: TGGCCTTAGT CGTCCTCAGC CTAGGGCAGG 3660 

GCAGTGAAGA AAGCTCTCCC CX3CTCCTGCT GTAATGACCC AGAGTAGCSTT CCCC»GGCCG 3720 

GCATCITATO TGTGTCTTCC ACXAXCCTCA TGGTGGCACT TTTCTAGGCC TGTCTCCCAG 3780 

CATTGTGCAA GGCTCGGAAG AQAACCAGGA AGTGAAACTG GGTGAAAACA GAAAGCTCAA 3840 

TOQATGGGCT AGGTTCCCAG ATCATTAGGG CAGAGTTTGC ACGTCCTCTG GTTCACTGGG 3900 

AATCCRCCCA GCCCACGAAT CATCTCCCTC TTTGAAGGAT TTTWATTTCT ACTGGGTTTT 3960 

GGAACAAACT CCTGCTGAGA CCCCACAGCC AGAAACTGAA AGCAGCAGCT CCCCAAAGCC 4020 

TGGAAAATCC CTAAGAGAAG GCCTGGGGOA MAGGAAKTGG AGTGACAGG3 GACAGQTAOA 4080 

GAGAAGGGGG CXZCAATGGCC AGGGAGTGAA GGAGGTGGCG TTGCTGAGAG CACSTCTGCAC 4140 

ATGCTTCTGT CTGAGTGCAG GAAGGTGTTC CaOGGTCGAA ATTACACTTC TOGTACCTGG 4200 

AGACGCTGTT TGTGGGAGCA CTGGGCTCAT GCCTGGCACA CAATAOOTCT OCAATAAACC 4260 
ATGGTTAAAT CCTGAAAAAA AAAAAAAAA 

Seq ID NO: 333 Protein sequence
Protein Accession #: NP_000011



1 11 21 31 41 51 

I I I I I I 

MTLGSPHKGL LMIiLMALVTQ GDPVKPSRGP LVTCTCBSPH caCGPTCRGAM CTWliVREEG 
RHBOEaiRGOG NLKRELCRGR PTEFVNHYCC DSHLCNHNVS LVIjEATQPPS EQPGTDGQIiA 
LILGPVLAIiL ALVALGVLGL WHVRRRQEKQ RGLHSELGES SMLKASEQQ DTMLGDLLDS 
DCTTGSGSGL PFLVQRTVAR QVAIjVBCVGK GRyGEVWRGI. WHGESVAVKI FSSRDEQSWF 
RETBIYNTVL LRHONILGFI ASratTSRNSS TQLWLITHYH EHGSbYDFLQ RQTI.EPHLA1. 
RIAVSAACX3I. AHLHVEIPGT QOKPAIAHRD FKSRMVI.VKS HLQCCIADLQ LAVMHSQGSD 
YUtlCamPRV OTKRYMAPSV LDBQIRTDCF BSYKWTDIWA PGLVIiMElAR RTIVNGIV^ 
YRPPFYBWP NDPSPEDMKK WCVDQQTPT IPNRLAADPV LSGUQMHRE CHYPHPSARL 
TALRIKKXI« KISMSPraCPK VIQ 

Seq ID NO: 334 DNA sequence

Nucleic Acid Accession #: NM_004126.1

Coding sequence: 108-329 

1 11 21 31 41 51

I I I I I I 

GGCACGAOCT CGTGCCGGCC TTCJW3TTGTT TCGGGACQCG CCQAGGTTCO CCOCTCTTCC 
AGCGGCTCCG CTGCCAGAGC TAGCCCGAGC CCGQTTCTGQ GGCGAAAATG CCTGCCCTTC 
ACATCGAAGA TTTGCCAGAG AAGGAAAAAC TGAAAATGGA AGTTGAGCAG CTTCX3CAAAG 
AAGTGAAGTT GCAGAGACAA CAAfiTGTCTA AATGTTCTGA AGAAATAAAG AACTATATTG 
AAQAACGTTC TGGAGAGGAT CCTCTAGTAA AGGGAATTCC AGAAGACAAG AACCCCTTTA 
AAGAAAAAGG CAGCTGTGTT ATTTCATAAA TAACTTGGGA GAAACTGCAT CCTAAGTGGA 
AGAACTAGTT TGTTTTAGTT TTCCCAGATA AAACCAACAT GCTTTTTAAQ OAAGGAAGAA 
TOAAATTAAA AGGAGACTTT CTTAAGCACX: ATATAGATAG GGTTATOTAT AAAAQCATAT 
GTGCTACTCA TCTTTGCTCA CTATGCSM3TC TTTTTTAAGA GAGCRGAGAa TATCAGATGT 
ACAATTATGG AAATAAGAAC ATTACTTGAa CATGACRCTT CTTTCAGTAT ATTGCTTGRT 
GCTTC3UIATA AflfiTTTTGTC TT 



I I I I I I 

MPALBIEDLP EKEKLKMEVE QLRKEVKLQR QQVSKCSEEI KNYIEERSGE DPLVKGIPED 
KNPFKBKGSC VIS 



Seq ID NO: 336 DNA sequence
Nucleic Acid Accession #: NM_005795
Coding sequence: 555-1940

1 11 21 31 41 51 

I I I I I I

GCACXyVGGGA ACAACCTCTC TCTCTSCAGC AGAGAGTGTC ACCTCCTGCT TTAQGACCAT 60 

CAAGCTCTGC TAACTGAATC TCATCCTAAT TGCAGGATCA CATTGCAAAG CTTTCACTCT 120 

TTCCCACCTT GCTTGTGGGT AAATCTCTTC TGCGGAATCT C3MSAAAGTAA AGTTCCATCC 180 

TCAGAATATT TCACAAAGAA TTTCCrTAAG AGCTGQACTG GGTCTreACC CCTGOAATTT 240 

AAGAAATTCT TAAAGACAAT GTCAAATATG ATCCAAGAGA AAATQTGATT TOAGTCTGGA 300 

GACAATTGTG CATATCGTCT AATAATAAAA ACCXaVTACTA GCCTATASRA AACAATATTT 360 

GAATAATAAA AACCCATACT AGCCTATACMV AAACAATATT TGAAAGATTQ CTACCACTAA 420 

AAAGAAAACT ACTACAACTT GACAAGACTQ CTQCAAACTT CAATTGGTCA CCACAACTTG 480 

ACAAGGTTGC TATAAAACAA GATTGCTACA ACTTCTAGTT TATGTTATAC AGCATATTTC 540 

ATTTGGGCTT AATGATGGAG AAAAAGTGTA CCCTGTATTT TCTGGTTCTC TTGCCTTTTT 600 

TTATGATTCT TGTTACAGCA GAATTAQAAG AQAGTCCTGA GGACTCAATT CAGTTGGGAG 660 

TTACTAGAAA TAAAATCRTG ACBGCTCAAT ATGAATGTTA CCAAAAGATT ATGCAAGACC 720 

CCATTCAACA AOCAGAAGGC GTTTACTGCA ACAGAACCTG GGATGGATGG CTCTGCTGGA 780 

ACGATGTTGC AOCAGGAACT OAATCAATOC AGCTCTGCCC TGATTACTTT CAGGACTTTG 840 

ATCCATCAOA AAAAGTTACA AASRTCIGTG AOCAAOATGO AAACTGQTTT AOACATCCAG 900 

CAA6CAACAQ AACKTOQACA AATTATAOCC AaTGTAATGT TAACACCCSU: OAGAAASTGA 960 

AGACKSCACT AAATTTCTTT TACCTOAOCA TAATTGQACA CaGATTGTCT ATTGCATCAC 1020 

TGCTTATCTC GCTTGGCATA TTCTTTTATT TCAAGAGCCT AAOTTGCCAA AGaATTACCT 1080 

TACACAAAAA TCTGTTCTTC TCATTTGTTT GTAACTCTGT TGTAACAATC ATTCACCTCA 1140 

CTGCSW3TGGC CAACRACCAG GCCTTAOTAG CCACAAATCC TGTTAGTTGC AAAGTGTCCC 1200 

AGTTCATTCA TCTTTACCTG ATGGGCTGTA ATTACTTTTG GATGCTCTGT GAAGGCS^TTT 1260 

ACCTACACAC ACTCATTOTQ GTGGCXX3TGT TTGCACaGAA GCAACATTTA ATGTGGTATT 1320 

ATTTTCTTGQ CTGGGGATTT OCACTCATTC CTQCTTOTAT ACATGCCATT GCTAGAAGCT 1380 



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25 



TATATTACAA 
GCCCAATTTG 
TCATCACCAA 
GAGCTACTCT 
CTGAAGGAAA 
AGGGTCTTTT 
GAAQAAACTG 
TTCGTAGTGC 
GTCCTAGTGA 
CAGAAfiATTT 
AACrCAAGGA 
GGGAATGTCA 
ATCCAGCTCT 
CACTATGCCr 
ACAATCAACT 
AAATGGCTGT 
GACCTAGCTA 
TCCCATCTTG 
TAACTACCCT 
CTATQAAAAO 
ATCTTGTGGC 
TTCTATATCA 
TGTCTTACCA 
TCTACTGTAT 
ATTTTCTTGG 
TTTATTTTAT 
AATGCAACAA 
AATAGAGTCT 



MEKKCTLYPL 
EGVYCUHTWD 
NTNYTQCNVN 



TGCTQCTTTA 
C3TTAAAAQTT 

TATCTTGGTG 
GATTGCA6A6 
GQTCTCTACC 
GAATCaUVTAC 
GTCrTACACA 
ACACTTAAAT 
ATATAATTOA 
CTTGGACCCA 
TAAAGAAGAG 
ATGTGGGAAA 



TTTCTGAGCT 
AAAACTAAAC 
AGGTCTATAA 
ATTGGGGCAG 
CTCAAATGOA 
CAACTGA8TA 
ATATCCATTQ 
TTAGGAAAAC 
AACAGTGGGA 
AAACAAATTA 
AATTTTGTAA 
AGTCTCAAAT 
TGTGTGTATG 
GGAATGCT 



AAAATCCAAT 
GTGTCAACAA 
GGAAAAM3CA 
AAATAGAAGG 
TOACTCTGTA 
CCTTCACATG 
AAAGAAATCC 
TACTAACCTG 
GGTGTAAGCC 
ATACATGTTG 
ACATGAAGGQ 
TTGACTTTTT 



r CTGKTACCCA T 

: i-m"X " i - icii c 

3 CGGAATCCAA 
3 GCATTQAATT 
ACTACATCAT 
TCTTTAATGG 
TTGGAAACAG 
TCAGTGATGG 
TCCATOATAT 



GCCAOAAGAC 
AA ATTAO TAG 
TGGTTTGTAA 
ACATCACCAA 



GGCATGATTC 



CAATTGTTAT 
TGGAAACTGG 
ATCTTAGTTG 
GGGAATTCCT 



TTTTTTCCCA 
AGTIGAATTAT 
GATCTACTCA 



TGTGCTGATT 
GCACATCCTT 
AGAGGTTCAA 
CTTTTCCAAC 
TCCAGGTTAT 
TGAAAATGTT 
CACTGTTTGO 
TTCAATATTA 
TGTGTTGATA 
TGTTTGTCAG 
GTGTGGAATT 
CACCATTGAT 
TACCCTTATT 
TTTAGTTTTA 
OAGTGCCXSTA 
CCCTGCTGGC 



ATTATCCAIG 
GTACGCeTTC 
AAAGCTOTGA 
CCATGGOGAC 
ATGCACTTCC 
GCAATTCTGA 
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1620 
1680 
1740 



TCAGAAQCTC IBOO 



ATGCTACAAA 



AAAGAAATTG 



TTAATATCTG 



TATATAAAGA 
TGAAAAATGA 
CRACCTATGT 
ATACTGTATC 



TCTATAATAT 
ACACCTTGTC 
ATAAATTTTG 
AAATCAATGA 
GCTTGTAAAT 
AATTTTTAAA 
TGOGCTGATT 



AGTCATGACr 
CTCTTAAAAC 
TQCTTCTCCT 
AATGACTTTG 
AGA6T6TAAC 
TAAATACTCC 
GGAGAAAAGC 
GAATTCAAAC 
CSCCCCAAGA 
AAACTCTTTA 
GTCCTTTTTG 
TTTCTTTTCT 
CATC34GTTAT 
GCAATCTTAC 
AACCTCTTCC 
CCCTTCCATT 
AGGATTTCTT 
ACTCCATTAT 
GCAAATATAT 
TTTTAAATAA 



18G0 
1920 
1980 
2040 
2100 



2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 



TIIHIiTAVAN 
HLMWYYFUa* 
NIVRVI.ITXL 
ILMHFQGLLV 



21 



TAELEBSPED BIQLSVTKSK 
GTBSMQI^D yFQDFDFSEK 
IiFYIaTIIGHO I.SIASIiI>ISL 
NQALVATNFV SCKVSQPIHL 
GFPIiIPACIH AIARSLTOJD 
tCVTHQAESNIi YMKAVRATI>I 



IWAVFAEKQ 
AI.I.VKLPFI.L 
AEBWDYIMH 

YTVSTISDGP GYSHDCPSEH LNGKSIHDIB IIVLIJCPBNI.Y H 

Seq ID NO: 338 DNA sequence
Nucleic Acid Accession #: NM_001795
25-2379 



41 51 

I I 

IMTAQYECYQ KIMQDPIQQA 

VTKICDQDGN WFRHPASNRT 

GIFFYPKSLS CQRITLHKNL 

YLMGCNYFWM LCEGIYLHTL 

HCWISSDTHL tYIIHGPlCA 

I.VPLI.GIEFV LIPWRPEGKI 
QYKIQFGNSF SNSEAUtSAS 



Coding sequence: 



GCACGATCTG 
GCCTGCCTGG 
CGGGACaiCCC 



3 GAAGATGCAG 



TCAAGCGTGA 
TTCCGGGTCG 
ATCTCAGAGT 
ACTCCTTCCA 
CATCXjGTTGT 
GXOACAGCAG 



I 

GCCTGCTGGC 
ACAGCCTQCT 
TTGATGAAOA 
GTCGCAAGAA 
ATGCaGAQAC 
ACCACCTCAC 
GCTTCACCAT 
TCAATGCGTC 
TGGATGCAGA 
QOAAAOAGTA 



: GGGOGQACTC QGGCAOQGCC 
: CCTTCTTCAC CCAOACCAAQ 
C CTGTGGGCTC TCT6TTTGTT 
\ GCATCTTGCG GGGCGACTAC 
3 AGGGCATCAT CAAGCtXATQ 
\ TCGTCGAGGC CACAGACCCC 
\ GAGCCCAGGT CATTATCAAC 
r TCTACCACTT CCAGCIGAAG 



GCCCACCX»C CQGCGCCAAA AGAGAGATTG GATTTGOAAC 180 



TGCCAAGTAC 
AGOASACGTO 
TGCTGTCATT 
CAAAGTTCAT 
COTGCCTGAO 
CGACCCCACT 



TTCGCCATTG 
GTGGACAAGG 
GACGTGAAOG 
TCGTOGGCTG 



TACACATTTG 



CAGGACGCTT 
AAGCCTCTGG 
ACCATCX3ACC 
ATCACAGATG 
OAAAACCAGA 



ATCATOTAQG 
GAOAATATOT 
AGAGGCTGGA 
ACACTGtSTQA 
ACAACTGGCC 
TGGGGACCTC 
AOKXTCTGT 
GACGTATTAT 
TOSTGOAAGC 
TCACTCTGCa 
TOGTGCCTGA 
ATOAQCCCCA 
TCACCATTGA 
ATTATGAATA 
TCXXIATACAT 
TGGACGAJ3CC 
AGAAGCCTCT 



CAAGATCAAG 



t AAGTCTACCC 
: CCACAGGAAA 
3 CCCCGOAQTT 
: TGGTCCXGCA 

r tcacctkjaa 

I TCACAGTCAA 
S TCATCTCAGA 
r GGAASTaCMV 



CTGQTATAAC 
AGAATCCATT 
TGCCAAGCCC 
GATCrCCGCA 
TACTGAGAAC 
OTATGGGCAO 
CAATGGOATG 
CGAGCAGGGC 



GTGCAAGTCC ACATTGAAQT 
TACXaCCCCa AAGTGTGTGA 
ATAGACAAGG ACATAAC3VCC 
AACTTTACCC TCACGGATAA 
TTTGACCGGG AGCATACXSIA 
CCAAGTCGCA OGGGCAOCAQ 
G»STTCACCT TCIGC8AGGA 



CCGGGA6AAT 
AAACCTGGAG 
TGTGTTCACG 
AGTCATCTCT 
CATGTACCAA 
CACAATAAC6 
GOGAOATGCC 
AGACATCAAT 
AGACACCC3GT 
OAAOOGGATG 
GACAAACCCC 
CATCCAGCAA 
GAGCCCTCCC 
CCCCATTTTC 
GATTGGCACA 
CCX3CAGGACC 
TGAGAAA6AA 
ACTGGATTCC 
TTTGGATGAG 
GAACGCTGTC 



1030 
1140 
1200 



TCACGATAAC 
GGTCCACTTC 
CACGCTGACC 



CAGGTGGGOO TGA6CATCCA GGCAGTGGTA GCCATCTTAC TCTGCATCCT CACCATCACA 



GTGATCAOCC 
AAGAGCGTGC 
ATGGACACCA 
CCXXCGCGGC 
AGGCACX3CX5C 
AAGQACGAGG 
TACGAGGQCT 



TGCTCATCTT CCTGaSGOGG 
OGGAGATCCA CGAGCAGCTG 
CCAGCTACGA TGTGTCGGTG 
CCGCGCTGGA CGCCCGGCCT 
CTGGGGCACA CGGAGGGCCC 



CCGAGTCCAT AGCOGAGTCC 



GTCACCTAC3G ACGAGGAGGG CGGCGGCGAG 
CTCAACTCGG TGCQCCGCGG CGGGGCCAAG 
TCCCTCTATG CGCAGGTGCA GAAGCCACCG 
GGGGAGATGG CAGCCATGAT CGAGGTGAAG 
CCCCXXTTACG AC3VCGCTGCA CATCTACGGC 
CTCAGCTCCC TGGGCACOSA CTCATCOGAC 



1920 
1980 
2040 
2100 
TCTGACGTGG ATTACGACTT CCTTAAOGAC TGGGGACCav GGTTTAAGAT GCTGGCIGAa 2340

CTGTACGGCT CGGACCCCCG GGAGGAGCTG CTGTATTAGG CGGCCX3AGGT CACTCTGGGC 2400 

CTGGGGACCC AAACCCCCTO CAGCCCAGGC CAGTCAGIWCT CCAGQCACCA CAGCXTTCCAA 2460 

AAATGGCAGT GACTCCCCAG CCCAGCACCC CTTCCTOGTG GGTCCCAOAO ACCTCATCAG 2520

CCTTGGGATA GCAAACTCCA GGTTCCTGAA ATATCCAGGA ATATATGTC3V GTGATGACTA 2580 

TTCTCAAATG CTGGCAAATC CaVGGCTGGTC TTCTGTCTGG GCTCAGACAT CCACATAACC 2640 

CTGTCACCCA CAGACCGCCG TCTAACTCAA AGACTTCCTC TGGCTCCCCA AGGCTGCAAA 2700

GCAAAACAGA CTGTGTTTAA CTGCTGCAGG OTCTTTTTCT AGGGTCCCTQ AACGCCCTGG 2760 

TAAGGCTGGT GAGGTCCTGG TGCCTATCTG CCTGGAGQCA AAGGCCTGQA CAGCTTOACT 2820 

TGTGGGGCAG GATTCTCTGC AGCCCATTCC CARGGGAGAC TGACCATCAT GCCCTCTCTC 2880 

GGGAGCCCTA OCCCTGCTCC AACTCCftTAC TCCACTCX3UI GTGCCCCAOC ACTCCCCAAC 2940 

CCCTCTCCAG GCCTGTCAAG AGGOAOGAAfi GGGCCCCATG GCAaCTCCM ' ACCTTQOQTC 3000 

CTGAAGTGAC CTCACTGGCC TGCCATGCCA GTAACTCTGC TGTACTGAGC ACTGAACCAC 3060 

ATTCAGGGAA ATGCTTATTA AACCTTGAAG CaUVCTOTGAA TTCATTCTGQ AGGGGCAGTG 3120 

GAGATCAGGA GTGACAGATC ACAGGGTGAG GGCX3VCCTCC ACACCCACCC CCTCTGGAGA 3180 

AOGCCTGGAA GAGCTGAGAC CTTGCTTTGA GACTCCTCAG CACXXXTTCCA GTTTTGCCTG 3240 

AGAAGGGGCA GATGTTCCCG GAaATCAOAA GACGTCTCCX: CTTCTCTGCC TOVCCTGOTC 3300 

GCCAATCCAT GCTCTCTTTC TTTTCTCTGT CTACTCCTTA TCCCTTGGTT TAGAGGAACX: 3360 

CAAGATGTGG CCTTTAGCAR AACTGAC3UIT OTCCAAACCC ACTCATCACT GCATGAOQGA 3420 

GCCGAGCATG TGrCTTTACA CCTCGCTGTT GTCSlCATCrC AGGGAACTQA CCCTCAGGCA 3480 

CACCTTGCAG AAGGAAGGCC CTGCCCTGCC CAACCTCTGT GGTCACXSaX GCATCATTCC 3540 

ACTGGAACGT TTCACTGCAA ACACAOCTTG GAOAAGTGGC ATCAGTCAAC AGAGAGOGQC 3600 

AGGGAAGGAG ACAOCAAGCT CACCCTTCGT CATGGACCGA GGTTCCCACT CTGGCAAAQC 3660 

CCCTCACACT GCAAGGGATT GTAGATAACA CTGACTTGTT TGTTTTAACC AATAACTAGC 3720 

TTCTTATAAT GATTTTTTTA CTAAT6ATAC TTACAAQTTT CTAGCTCTCA CAGACATATA 3780 

GAATAAGGGT TTTTGCATAA TAAGCAGGTT GTTATTTAGG TTAACAATAT TAATTCAGGT 3840 

TTTTrAGTTG GAAAAACAAT TOCTGTAACC TTCTATTTTC TATAATTGTA GTAATTGCTC 3900

TACAGATAAT GTCTATATAT TGGCCAAACT GGTGCATGAC AAOTACTGTA TTTTTTTATA 3960 
CCTAAATAAA GAAAAATCTT TAGCCTGGGC A 



1 11 21 31 41 51 

I I I I I I 

MQRUdMLIiAT SGACa.GIiIAV AAVAAAGAMP AQHDTHSI.I.P THRRQKHDWI WNQMHIDBBK 
NTSLPHHVGK IKSSVSRKNA. KXLLXQEYVG KVFRVBAETG DVFAIERUIR EMISEYHLTA 
VIVDKDTGEtT IiETPSSFTIK VBDVNDNWFV FTHRI.FIIASV PBSSAVGTSV tSVTAVDASO 
PTVGHHASVM YQILXaKEYF AIDNSGRIIT ITKSLDREKQ AK YE I WEAR DAQGLRGDSG 
TATVIjVTIiQD INDNFPFFTQ TKyXFWPED THVGTSVOSL FVBDPDBFQN RMTKYSIIiRG 
DYQDAPTIET HPAHNEGIIK PMKPLDYEYI QQYSFIVEAT DPTIDLRYMS PPACaUHAQVI 
INITDVDEPP IFQQPFYHFQ LXBNQKKPLI GTVIiAMDPDA ARHSIGYSIR RTSDKGQPPR 
VTKKGDIYNE KELDRBVYPW YNLTVEAKEL DSTGTPTGKE SIVQVHIBVL DENDNAPEFA 
KPYQPKVCEN AVHGQLVLQI SAIDKDITPR NVKFKFTLNT ENNFTLTDNH DNTANITVKY 
GQPDREHTKV HFLPWISDN GMPSHTGTST LTVAVCKCNB QGEFTFCEDM AAQVQVSIQA 
WAILLCILT ITVITLLIFL RRRIiRKQARA HGKSVPEIHB QLVTTOEEGQ GEMD TTSVPV 
SVUfSVKItGG AKFPRPALDA RPSLYAQVQK PPBHAPGAHG GPGEMAAMIS VKXDBADBDQ 
OGPPYDTLMI YGrEGSESXA ESLSSLGTDS SOSDVDVDFL NDWGPRFKMZ. AELYGSDPRE 
EUaY 



Seq ID NO: 340 DNA sequence
Nucleic Acid Accession #: NM_003088
Coding sequence: 112-1593 

1 11 21 31 41 51 

I I I I I I 

GCOGAGGGTG CGTGCGGGCC GCGGCAGCCG AACAAAGGAG CAGGGGCX3CC GCOjCAGGGA 60 
CCCGCCACCC ACXrrCCCGGG GCCGCGCAGC GGCCTCTCGT CTACTGCX»C CATGACCGCC 120 
AACGGCACAG CCQAGGCGCST GCAGATCCAG TTCGGCCTCA TCAACTGCQG CAACAAGTAC 180 
CTGACGGCCX3 AGGCGTTCGG GTTCAAQGTG AACGCGTCCO CCAGCAGCCT aAAOAAGAAa 240 
CAGATCTGGA CGCTGGAGCA GCCCCCTGAC GAGaCGGGCA GCGCGGCCGT eiGCCIGCGC 300 
AGCCACCTGG GCCGCTACCT GGCGGOGGAC AAGQACGGCA ACGTGACCTG CGAGCGCGAG 360 
GTGCCCGGTC CCGAcfrGCCG TTTCCTCATC GTGGCGCACG ACGACGGTCX3 CTGGTCGCTG 420 
CAGTCCGAGG CGCACOSGCG CTACTTCGGC GGCACCGAGG ACCX3CCTGTC CTGCTTCGCG 480 

CAGACGGTGT CCCCCGCCGA GAAGTG6AGC GTGCACATCG CCATGCACCC TCAGGTCAAC 540
ATCTACAGTG TCACCCGTAA GOGCTAOGOG CACCTGAGCG CGOGGCOSGC CGAOSAGATC 600 
GCXMTGGACC aaSACGTGCC CTGGGGCGTC GACTCGCTCA TCACCCTCGC CTTCCAGGAC 660 
CAGCGCTACA GCGTGCAGAC CGCOGACCAC CGCTTCCTGC GCCACGACG6 G CGCCTGQTG 720 
GCGCGCCCXX3 AGCCGGCCAC TGGCTACAOS CTGGAGTTCC GCTCOSGCAA GGTOGOCTTC 780
CGCGACTGCG AGGGCCGTTA CCTOaCGCOS TCGGOGCCCA GCGGCACGCT CAAQGOGaaC 840 
AAGGCCACCA AGGTGGGCAA GGACGAGCTC TTTGCTCTGa AGCASAGCTO CGCCCAGGTC 900 
GTGCTGCAGG CGGCCAACGA GAGGAACGTG TCCACGCGCC AGGGTATGGA CCTGTCTGCC 960 

AATCAGGAC6 AGGAGACCOA CCAGQAGACC TTCCAGCTGG AGATCXSACCG CGACACCAAA 1020 

AAGTGTGCCT TCOGTACCCA CACGGGCAAC TACTCGACGC TGAOGGCCAC CGGGGQCGTG 1080 

CAQTCCACCQ CCTCCAGCAA GAATGCCAGC TGCTACTTTG ACaTOSAGTG GCGTGACCGG 1140 

CGCATCACAC TGAGGGCGTC CAATGGCAAO TTrOTGACCT CCAAOAAGAA T6GGCA6CTG 1200 

GCCGCCTCGG TGGAGACAGC AGGGGACTCA GAGCTCTTCC TCATGAAGCT CATCAACCXJC 1260 

CCX3CTCATCG TGTTOCGCGG G6A6CAXGGC TTCATOGGCT GCCGCAAGGT CAOGGGCACC 1320 

CTGGAOGCXai ACOGCTCCAG CTATQACGTC TTCCAGCTGG AOTTCAAOGA TGGOOCCTAC 1380 

AACATCAAAG ACTCCACAGQ CAAATACTGG ACGGTGGGCA GTGACTCCGC GGTCACCAGC 1440 

AGCGGCGACA CTCCTOTGGA CTTCTTCTTC GAGTTCTGOG ACTATAACAA GGTGGCCATC 1500

AAGGTGGGCG GGCGCTACCT GAAGGGCGAC CACGCAGGCG TCCTGAAGGC CTCGGCGGAA 1560

AOGGTGGACC COGOCTCGCT CT6GGAGTAC TAGGGCCGGC CWGTCCTTCC CCGCCCCTGC 1620 

CCACATOGCO GCTCCTGCCA ACOCTCCCTQ CTAACCCCTT CTCCGCCAGG TGCGCTCCAG 1680 

GGCGGGAGGC AAOCCCCCTT GCCTTTCAAA CTGGAAACCC CAQAGAAAAC GGTGCCCCCA 1740 

CCTGTCGCCC CXATGGACTC CCCACTCTCC CCTCOOCCCG GGTTCCCTAC TCCCCTCGGQ 1800 
TCAGCBGCTG CGGCCTGOCC CTGGGAOGGA TTTCAOATGC CtXTTGCCCTC TTGTCTGCCA 1860

CGGGGOOAGT CTGGCACCTC TTTCTTCTSA OCTCAGACGG CTCTGAGCCT TATTTCTCTO 1920 

GAAGCXX3CTA AGKiSAOGGTT GGGGGCTG6G AGCCCTGGGC GTGTAGTCTA ACTGGAATCT 1980 

TTTGCCrCTC CCAGCCACCT CCTCCCAGCC CX:CX»GGAGA GCTGGGC3VCA TGTCCCAAGC 2040 

CTGTCAGTGG CCXITCCCTGG TGCACTGTCC CCGAAACCCC TGCTTGGGAA GGGAAGCTGT 2100 

CGGGAGGGCT AGGACTGACC CTTGTGGTGT TTTTTTGGGT GGTGGCTGGA AACAGCCCCT 2160

CTCCCACX3TG GGAGAGGCTC AGCCTGGCTC CCTTCCCTGG AGCGGCAGGG CGTGACGGCC 2220 

ACAGGGTCTG CCCKCTGCAC GTTCTGCCAA GGTGGTGGTG GCGGGCGGGT AGGGGTGTGG 2280 

GGGCCGTCTT CCTCCTGTCT CTTTCCTTTC ACCXTTAGCCT GACTGGAAGC AGAAAATGAC 2340 

CAAATCRGTA TnrrXTTTAA TOAAATATTA TTGCTGGAGG OSTCCCAGGC AAGCCTGGCT 2400 

GTAGTAOCQA GTOATCTGGC GQGG6GCGTC TCAGCACCCT CCCX3\GGGGG TGCATCTCAG 2460 

CXKCCTCTTT COGTCCTTCC CGTCCAGCCC CAOCCCTGGO CCTGGGCTGC CGACACCTGG 2520 

GCCAGAGCCC CTGCTGTGAT TGGTGCTCCC TGGGOCTCCC GGGTGGATGA AGCCAGGGGT 2580 

OGCCCXXrrCC GGGAGCCCTG GGGTGAGCCG CCGGGGCCCC CCTGCTGCCA GCCTCCCCCG 2640 

TCCXX»ACAT GCATCTCACT CTGGQTGTCT TGGTCTTTTA TTrTTTGTAA aiGTCATTTG 2700 

TATAACTCTA AACGCCCATG ATASTAGCTT CAAACTGGAA ATAGCGAAAT AAAATAACTC 2760 

Seq ID NO: 341 Protein sequence 
Protein Accession #: NP_003079

1 11 21 31 41 51 

I I I I I I 

MTANGTAEAV QIQFGLIflCG NKYLTAEAFG FKVNASASSI. KKKQIWTLEQ PPDBAQSAAV 60 

CLRSHIiGRYL AADKIK2JVTC EHEVPGPDCR PLIVAHDDGR WSLQSEAHRR YFGGTEDRLS 120 

CFAQTVSPAE KHSVHIAMHP QVNIYSVTRK RYAKLSARPA DBIAVDiOJVP WGVDSLITIA 180 

PQDQRVSVQT ADHRPL8HDG RIjVARPEPAT QYTLEFHSGK VAFRDCEGRY LAPSGPSQTL 240 

KAOiCATKVGK DELFAI^QSC AQWLQAANB RMVSTRQ9(D LSANQDBETD QETFQIflDR 300 

DTKKCAFRTH TGKYWTI.TAT GGVQSTASSK KASCYFDIBW RORRITUIAS NOKPVTSKKK 360 

GQUU^SVETA GDSELPUOCIi IMRPIIVFRG BHOPIGCRKV TOTLDANRSS TOVFQLBFND 420 

GAYNIKDSTG KYMTVGSDSA VTSSGDTPVD PFFEFCDYNK VAIKVGQHYL KQDBAGVLKA 480 
SAETVDPASL WEY 



Seq ID NO: 342 DNA sequence 

Nucleic Acid Accession #: FGENESH predicted
Coding sequence: 660-1705

1 11 21 31 41 51

I I I I I I 

CGCTCCOCAC ACATTTCCTG TCX5CGQCCTA AGGQAAACIG TTGGCCGCTG GGCCCGOGGG 60 

GGGATTCTTG GCaUSrTGGGG GOTCCGTCGG GAGCOAGOaC GOAGOGGAAG GGAGGGGQAA 120 

COGGGTTGGG GAAGCCAGCT CTASAQGGOG GTGACCQCOC TCCAGRCRCA GCTCTGCGTC 180 

CTCGAGCGGG ACASATCCAA GTTGGGAGCA GCTCTGGOTG OGGGGCCTCA GAGAATGAGG 240 

COGGOGTTOG CCXTOTGCCT CCTCTGGC3U3 OCQCTCTGGC CCGGGCCaGG CGGCX3GCGAA 300 

CACCCCACTG CCGACCX3TGC TGGCTGCTCG GCCTCGGGGG CCTGCTACSUS CCTGCACCAC 360 

GCTACCATGA AGCGGCAGGC GGCCGAGGAG GCCTGCATCC TGCGAGGTGG GGCGCTCAGC 420 

ACXXSTGC33TG CGGGCGCCGA GCTGCGCGCT GTGCTCGOSC TCCTGCGGGC AGGCCCAGGG 480 

CCCGGAGGGG GCTCC3VAAGA CCTGCTGTTC TGGGTCX3CRC TGGAGCGCAG GCGTTCCCAC 540 

TCCACCCTGG AGAACGAGCX: TTTGCGGGGT TTCTCCTGGC TGTCCTCCGA CCXXBGCGGT 600 

CTOGAAAGOG ACACX3CTGCA GTGGGTGGAG GAGCCCCAAC GCTCCTGCAC CGCQCGGAGA 660 

TGOGOG8TAC TCCAGGCCAC CQGIGGGGTC GAGCCCGCA6 CTGGAAGGAS ATQCGATGCC 720 

ACCTGCGOGC CAACGGCTAC CTGTGCAA6T ACCASTTTQA GGTCTTCTGT CCTGCGCOSC 780 

GCCCCGGGGC CGCCTCTAAC TTQAGCTATC GCGOGCCCTT CCAGCTGCAC AGCQCCX3CTC 840 

TGGACTTCAG TCCACCTGGG ACCGAGGTGA GTGCGCTCTG CCGGGGACAG CTCCaSATCT 900 

a«3TTACTTG CATCGCGGAC GAAATCGGCG CTCGCTGGGA CAAACTCTCQ GGOSATGTGT 960 

TGTGTCCCTG CCCCGGGAGG TACCTCCGTG CTGGCAAATG CWCftGftGCTC CCTAACTGCC 1020 

TAGACGACTT GGGAGGCTTT GCCTGCGAAT GTGCTACGGG CTTCX3AGCTG GGGAAGGACG 1080 

GCCGCTCTTG TGTGACCAGT GGGGAAGGAC AGCOSACCCT TGGGGGGACC GGGGTGCCCA 1140 

CCAGGCGCCC GCCGGCCACT GCAACCAGCC CCGTGCXGCA GAGAACATGG CCAATCAGGG 1200 

TGQAGGASAA aCTGGQAGAG ACACCACTTQ TCCCTGAACA AQACAATTCA GTAACATCTA 1260 

TTCCTGAOAT TCCTOSATGa GGATCACAOA GC3«3QATGTC TACCCTTCAA ATGTCCCTTC 1320 

AASCCGAQTC AAASOCCACT ATCACCCCAT CAGGGAGOQT GATTTCCARG TTTAATTCTA 1380 

CQACTTCCIC TQCCACTCCT C3«3GCTTTCa ACTCCTCCTC TGCCGTGGTC TTCAT ATTrS 1440 

TGAGCACAGC AGTAGTAGTG TTGGTGATCT TGACCATGAC AGTACTGQGG CTTGTCAAGC 1500 

TCTGCrrrcA CXJAAAGCCCC TCTTCCCAGC CAAGGAAGGA GTCTATGGGC CCGCOGGGCC 1560 

TGGAGAGTGA TCCTGAGCCC GCTGCXTTGG GCTCCAGTTC TGCACATTGC ACAAACAATG 1620 

GGGTGAAAGT aGGGGACTGT GATCTGOGGG ACAGA8CAGA GGGTGCXrTTG CTGGCG6AGT 1680 
CCCCTCTTGG CTCTAGTGAT GCATAG 

Seq ID NO: 343 Protein sequence
Protein Accession #:

1 11 
I I 

MGKDPMTKTP KAFATKAKID 
lAHIYKEMQ lYICKKKPTKT 
GbGKPAV&QG DRAPOTAURP 
PHCRPCWLIiG LGGUKIPAPR 
RRGIiQRPAVL GRTQAQAFPL 
RGTPGHRWOR ARSWKEMRCH 
DPSPPGTEVS A1X31GQLPIS 
DDLGGPACEC ATGPEIjGKDG 
DEia-GETPLV PEQDMSVTSI 
TSSATPQAFD SSSAWFIFV 
ESDPEPAAIjQ SSSAHCnntG 



FGENESH predicted

21 31 41 51 

I I I I 

KWDLIKLKSF CTAKETIIRV NSQPTDWQKT FAIYPSDKGV 60 

LRTHFLSRPK CTICHPLGPHG DSWQUJGPSQ ARAEGKOGQT 120 

RAGQXQVGSS SAGGASGMEA GVRPVPPLAQ ALARAGRRRT 180 

YHEAAGGRGG LHPARWGAQH RAOGRRAARC ARAPAGRPRA 240 

HPGBRAFAGF LLAVUIFRRS RKRHAAVGGG APTLUIRAEM 300 

UtAHGYLCKY QFEVLCPAPR PGAASNLSYR APFQIJISAAL 360 

VTCIADEIGA RWDKLSGDVI. CFCPGRYLRA GKCAELPNOi 420 

RSCVTSGBGQ PTLGGTGVPT RBPPATAT8P VPQRTWPIRV 480 

PBIPRWGSQS TMSTIiQMSIO AESKATITPS OSVISKFMST 540 

STAWVLVII. TMTVLGLVKL CFHESPSSQF BKESMGPPGIi 600 
VKVGDCDIiRO RABSALLABS PLGSSDA 
Seq ID NO: 344 DNA sequence
Nucleic Acid Accession #: NM_012072
Coding sequence: 149-2107 



1020



1260 
1320 
1380 
1440 



1 11 21 31 41 51

AAAGCCCTCA GCCTTTOTGT CCTTCrCTGC CSCCGCSAGTGG CTGCRGCTCA CCCCKaOCT 
CCCCTTGGGG CCCAGCTGGG AGCX^GAGATA GAAGCTCCTG TCXJCCCXn-GG GCTTCTCXXX: 
TCCCGCAGAG GGCCACACAG AOACCXSGOAT OGCCACCTCC ATCGGCOTGC TGCTGCrGCT 
GCTGCTGCrC CTGACCC3M5C CCGGQGCQGG OACGGGRGCT GACACGGAGG CGGTGGTCTG 
CGTGGtSGACC GCCTGCTACA CGGCCCACTC OGGCAAGCTG AGCGCTGCCX3 AGGCCCAGAA 
CCACIGCAAC CAGAACGGGG GCAACCTGGC CACTGTGAAG AGCAAGGAGG AGGCCCAGCA 
OOTCCAGCGA GTACTGGCCC RGCTCCTGAa GCGGGAGGCA GCCCTGACX3G CGAGGATGAG 
CAAGTTCTGG ATTGGGCTCC AGCXSAGAGAA GGGCAAGTGC CTGGACXCTA OTCTGCCGCT 
GAAGGGCTTC AGCTGGGTGG GCGGGGGGGA GGACACGCCT TACTCTAACT GeCACAAGOA 

GcrrccGGAAc tcxstgcatct ccaagcgctg tgtgtctctg ctgctggacc totoccaocc 

GCTCCTTOCC AACCGCCTGC CCAAGTGGTC TGAGGGCCCC TGTGGGAGCC CAGQCTCCCC 
COQAAGTAAC ATTOAGGGCT TCGTCTGCAA GTTCftGCTTC AAAGGCATGT GCCGGCCTCT 
GGCCCTBGGQ GGCCCAGGTC AGGTGACCTA CACCACCCXX: TTCCAGACCA CCAOTTCCTC 
CrCGQAGGCT GTGOCCTTTQ CCTCTQCGGC CAATCTAGCC TGTGGGGAAG GTGAC3UVGGA 
OGAGACTCAG AGTCATTATT TCCTGTGCAA GOAGAAGQCC CCOQATGTOT TCGACTGGGG 
CAGCTCBGGC CCCCTCTGTG TCAGCCCCAA GTATGGCTGC AACTTCAAC3V ATGGGGQCTG 
CXXCCAGGAC TGCTTTGAAG GGGGGGATGG CTCCTTCCTC TGCGGCTGCC GACCAGGATT 
CCGGCTGCTG GATGACCTGG TGACCTGTGC CTCTCQARAC CCTTGCAGCT COflOKMG 1080 
TCGTGGGGGG GCCACGTGCG TCCTGGGACC CCATGGGAAA AACTACACOT GCCGCTGCCX: 1140 
CCAAGGGTAC CAGCTGGACT CGAGTCAGCT GGACTGTGTG GACGTGGATO AATGCCAGGA 1200 
CTCCCCCTGT GCCCAGGAGT GTGTC»ACAC CCCTGQOSGC TtCCXJCTGCG AATGCTOGGT 
TOQCTATOAG CCGGGCGGTC CTGGAGAGGG GGCCTGTCAG GATGTGGATQ AGTGTGCTCT 
GGGTC»CTCG CCTTGCGCCC AGGGCTGCAC CAACACAGRT GGCTCATTTC ACTGCTCCTG 
TGAGGAGGGC TACGTCCTGG CCGGGGAGGA CGGGACTCAG TGCCAGGACX5 TGGATGAGTG 
TOTGGGCCCG GGGGGCCCCC TCTQCGACAG CrrGTGCTTC AACACACAAG GOTCCTTCCA 1500 
CTGTGGCTGC CTGCCAGGCT GGGTGCTGGC CCCAAATGGG GTCTCTTGCA CXATGGGGCC 1560 
TGTCTCTCTG GGACX^CCAT CTGGGCCXXrC CGATGASGAG GACAAAOGAG AGAAMAAGG 1620
GAGCACCGTG CCCOSCGCTG CAACAGCCAG TCCCACAAGG GGCOCCSAGO GOjCCCOCAA 1680 
QGCTACACCC ACCACAAGTA GACCTTCGCT GTCATCTGAC GCCCCCATCA CATCTGCCCC 1740
ACTCAAGATG CTQQCCCCCA GTGGGTCCTC AGGCGTCTGG AGGOBGCCCA GCATCCATCA 1800 
CGCCACAGCT GCCTCTGGCC CCCAGGAGCC TGCAGGTGGG GACTCCTCCG TGGCCACACA 1860 
AAACAACGAT GGCACTGACG GGCAAAAGCT GCTTTTATTC TACATCCTAG GCACCGTGGT 1920 
GGCCATCCTA CTCCTGCTGG CCCTGGCrCT GGGGCTACTG GTCTATCGCA AGaSGAGAGC 1980 
GAAGAGGGAG GAGAAGAAGQ AGAAGAAGCC CCAGAATGCG GCAGACAGTT ACTCCTGGGT 2040 
TCCAGAGCGA GCTGAGAGCA GGGCCATGGA GAACX»GTAC AGTCCQACAC CTGGGACAGA 2100 
CTGCTGAAAG TGAOSTQaCX: CTAGAGACAC TAGABMACC AlSCCACCATC CTCAQAGCTT 2160 
TGAACTCCCC ATTCC3VAAGG GGCACCCACA TTTTTTTQAA AOACTGBACT GGAATCTTAG 2220 
CAAACRATTG TAAQTCTCCT CCTTAAAGGC CCCTTGQAAC ATQCAGOTAT TTTCTACGGG 2280 
TGTTTGATGT TCCTGAAGTG GAAGCTGTGT GTTGGOSTGC CACGGTGGGG ATTTCGTGAC 2340 
TCTATAATGA TTGTTACTCC CCCTCCCTTT TCaAATTCCA ATGTGACCAA TTCCGGATCA 2400 
GGGTGTGAGG AGGCTGGGGC TAAGGGGCTC CCXTTGAATAT CTTCTCTGCT CACTTCCACC 2460 
ATCTAAGAGG AAAAGGTGAG TTGCTCATGC TGATTAGGAT TGAAATCATT TG TTTC TCTT 2520 
CCTAGGATGA AAACIAAATC AATTAATTAT TCAATTAGGT AAGAAGATCT GGTTTTTTGG 2580 
TCAAAGGGAA CATGTTCGGA CTGQAAACAT TTCTTTACAT TTGCATTCCT CCATTTCGCC 2640 
AGCACAAGTC TTGCTAAATG TCATACTOTT GACATCCTCC AGAATGGCCA GAAGTGCAAT 2700 
TAACCTCTTA GOTOGCAAQG AOGCAGGAAG TGCXTTCTTTA GTTCTTACAT TTCTAATAGC 
CXTGGOTTTA TTTGCAAAGG AAGCTTGAAA AATATGAGAA AAGTTGCTTG AAGTGCATTA 
CAGGTGrrtG TGAAGTCACA TAATCTACGG GQCTAGGGCG AGAGAGGCCA GGGATTTGTT 
Ca«3M3ATACT TGAATTAATT CATCCAAATG TACTGAGGTT ACCACaCACT TGACTAOGGA 2940 
TGTGATCAAC ACTflACAAGG AAACAAATTC AAGGACAACC TGTCTTTOAG CCTGGMMQ 3000 
CCTCAGACAC CCTCCCTOTG GCCCCGCCTC CACTTCATCC TGCCCX3GAAT GCCAGTGCTC 3060 
CX3AGCTC3VGA CAQAGGAAGC CCTGCauSAAA GTTCCATCA6 GCTGTTTCCT AAAGGATGTG 3120 
TOAACGGGAG ATGATGCACI GTGTTTTOAA AGTTGTCATT TTAAAGCRTT TTAGCACAGT 
TCATAGTCCA CAflTTQATGC AGC3VTCCTGA GATTTTAAAT CCTGAAGTGT GGGTGGCGCA 
CACACCAAGT AGGGAQCTAQ TCRGGCAGTT TGCTTAAGGA ACTTTTGTTC TCTQTCTCTT 
TTCCTTAAAA TrGGGGGTAA GGAGGGAAGG AAGAGGGAAA GAGATGACTA ACTAAAATCA 
TTTTTACAGC AAAAACTGCT CAAAGCCATT lAAATTATAT CCTCftTTTTA flAflGTTACM 3420 
TTGCAAATAT TTCTCCCTAT GATAATGCAG TCGATAGTGT GCACTCTTTC TCTCTCTCTC 3480 
TCTCrCTCAC ACACACACAC ACACACACAC ACACACACAC AGAGACACGG CACCATTCTO 3540 
CCTGGGGCAC TGGAACACAT TCCTGGGGGT CACCGATGGT CAOAGTCACr AGAABTTACC 
TGAGTATCTC TGGGAGGCCT CATGTCTCCT GTGG6CTTTT TACCACCACT GTGCAGQAQA 
ACAGACRGAQ GAAATOTGTC TCCCTCCAAG GCCCC31AAGC CTCAGAGAAA GGGTGTTTCT 
GGTTTreCCT TAOCAATGCA TCGGTCTCTG AGGTGACACT CTGGAGTGGT TGAAGGGCCA 
CaAGGTGCAG GQTTAATACI CTTGCCAGTT TTGAAATATA GATGCXATGG TrCAGATTGT 
TTTTAATAQA AAACTAAAGQ GGCAGGGGAA GTOAAAGGAA AOATGGAGGT TTrGTGCGGC 
TCGATGGGGC ATTTGGAACT TCTTTTTAAA GTCATCTCAT GGTCTCCAGT TTTCAaTTGG 
AACTCTGGTG TTTAACACTT AAGGGAGACA AAGGCTGTGT CC ATTTGGC A AAACTTCCTT 
GGCCACGAGA CTCTAGGTGA TCTOTGAAGC TGGGCAGTCT GTGGTOt^ t^.n 
CTGTCTGGCX: ATTCAGAGGA TTCTAAAGAC ATCGCTGQAT GOeCTGCTaA CCAACATC3VG 4140 
CACTTAAATA AATCCAAATG CAACATTTCT CCCTCTGGGC CTTGAAAATC CTTGCXXTTTA 4200 
TCATTTGGGG TGAAOOASAC ATTTCTGTCC TTGGCTTCCC ACAGCCCCAA CXSCAGTCTGT 4260 
GTAIGATTCC TGGGATCCRA CBAGCCXTTCC TATTTTCACA GTGTTCTGAT TGCTCTCACA 4320 
aOCCAGGOCC ATCGTCTGrr CTCTGAATGC AGCCCTGTTC TCAACAACAG GGAGGTCATO 4380 
6AACCCCTCT GTQGRACCCA CAAGGGGAGA AATGGGTGRT AAAGAATCCA GTrCCTCAAA 4440 
ACCTTCCXrrG GCAGQCTGGa TCCCTCTCCT GCTGGGTGGT GCT TTCTCTT GCACACX»CT 4500 
CCCACCACGG GGGGAGAGCC AGCAACCCAA CCAGACAGCT CAGOTTGIGC A TCTOAT GGA 4560 
AACCACTCGG CTCAAACACG TGCrTTATTC TCCTGTTTAT TTTTGCTGTT ACmOAAaC 4620 
ATGGAAATTC TTGTTTCGGG GATCTTGGGG CTAC»GTAGT GGGTAAACAA ATOCCC3VCCQ 4680 
GCCAAGAGGC CATTAACAAA TOGTCCTTGT CCTGAGGGGC aXaGCTTOC TCGGQ0GTG6 4740 
C3VCAOTGGGG AATCCAAGGO TCACAGTATG GGGAGAGGTG CACCCTOCCA CCTOCTAACr 4800 
TCTCGCTAGA CACAGTGTTT
CATGGGGACX3 GGGORAGTTT 
CCAAATAGGT CAATWVTTCT 
TCTCTCCCTC CCCTCATCCC 
OVCCCAGCTC QCCATGCCTA 

CTTcrrrGTC atttgagaaa 

CAGAAAAACX AGGGCAGGAC 
TAGAGC3GACT CCACCCCTGC 
CTCTGCCTTC GGTGGCCXAC 
AACACATCTA CGTGTAGCAC 
AGGCTCTGAT TAAQOATGTQ 
CTGGAG6CCT GTCTOTTAQC 
TGCCATCrrC CCTGCOATCA 
TGTGTTATGT CCATTTTGCR 
TTGCTTTGTC TTTTCXUiTCC 
TCTTTC3GATG GATGGAGATG 
ATTTCTGTGA AAACTAGGAG 
TAACAC3W5TC TTTTTAAAAC 
TCTCCATTGT CTAAATCAGQ 
TTAATGCCCC CIACATATTT 
TATOATCCCA GAAAACATCT 
TGGTTTOTGC ATTTTCTCAA 
CTAATCAAAG ACACTATTTT 
TTTAAAAATA AATTGTGTTT 
AAGCTCTGGA ATCCCTTTAT 
TTCAGCAGAT TTTGCCCACT 
GCTTCAATT& GATCCCTGCA 
GTAATCACIT CATGAATGCT 
TTTGTTTGAC TAATTCTOQA 
ATGTTTCCCA AACTGTGAGG 
CAAAAT6QTQ CTTTGAGGGT 
TCTOTTATCr GCCTATCCTA 






CTGCCCAGGT 
TCACTTGGAG 
GGGAGACTCT 
ACATCTCAAA 
CTCATTCCTG 
GOATGCAGGA 



QACCTGTTCA 
ATGGACACCA 
TGGAAAAAAC 
GCAGACAATG 
AATTTCAGGT 



AATGCATTAO 
GGCTTCCAGG 
GTCATCGTCA 



GCCATCACTG 
CACAGATAAT 
AACTTGGTGA 
CSkAGACCAAC 



GATTTGTTGT 
CAQGACCMC 
ACATCTCACA 
CTCTTTCTTT 
CTGAGGAATG 
GCATCCTCTG 
CACATCTGGT 
ATCATQATCiC 
GAACTGCATG 



TCAACAGCTT 
ACACCTAAQC 
TAOGAGGTTA 

GGGAAGTGGG CTGCGGTCAC TGTOGGCCTT GCAAGGCCAC 
GCCAOCCACA 
TTTATATGCA 
TCTCTTCAAG 
CCCTGAGCAA 
TCCTGTAAAT 
GAATTAAAAA 
CTTCTTAGCT 
GCAAAATGGT 
OAATCTGAQA 
TAAATCTATA 



GGATGAACTG 
TCATCACAAG 
ATCATTAGGT 
AACAGAGATQ 
TAACATAGGA 



TGGAATTAAA 
AGTTTAAAAG 

CCCTTG' 



ACTTTrGTTT 
AGATTTGACA 
AAGCCTTTCC 



GTCTCTACTT 
CTAAAAATA6 
CATACTAGAT 
TGGTCTGTTC 
TGTGCTGTTG 
ATTCCTCTGA 



AAGTCAAACC 
AATTTTTTTT 
AGTGTCTTAT 
CAACCTTTAT 
AAAAAAAATT 
TATTATTTCT 
TCTAGCAGCT 
TTTAQCTTCSO 



AOATGATAAT 
TCCTGAGACA 
TTGTAGATAA 
CTCTTATCra 
GCTGAAGTTC 
CTGTGATGTC 
GTAAGTATTT 
TTCTATGCAS 
CAGMiaTOGA 



CCQAATTCTC 
AATACTCACT 
TGCCCTTCTA 
CAAGQTGGCA 
TTTGCATAGA 
AGATGTAATT 
TTAAATQTGT 
GATTTACCTT 



GAAGGGCTTO 



OTATTTCAAA 



4860 
4920 
4980 
5040 
5100 
5160 
5220 

5340 
5400 
5460 
5520 

5580

5640 



5880 
5940
6000 
6060 
6120 

6240
6300 
6360 
6420 
6480
6540 
6600 



3 GAAGGTGCAG CTTTGTTGTC C 






LLLIaLTQPGA 
HVQRVLAQU. 
ELRNSCISKR 
LALGGPGQVT 




1/GLLVYRKRR 



CACAGCGCCC 
ACTTCX3GGGQ 
AGCCXMGACC 
CGCAAGATGG 
GCCGAGTGGA 
CTGCGAGCGC 
AAQACGCTCA 
AATAGCATGG 
ATGOACAGTT 
CAGCTGGGCT 
ATGCACCGCT 
ATOAACXSGCT 
CITGGCTCCA 
TCTTCCTCCC 



I I I I 

GTGADTEAW CVGTACYTAH SGKLSAABAQ NHCNQNOGBIIi 
KGKCLDPSLF UCGFSHVGGG 
SEGPCGSPGS PGSNIEGFVC 
ANVACGEGDK DETQSHYPLC 
GSFLCX3CRPG FRIiLDDLVTC 
LDCVDVDECQ DSPCAQECVN 
LGRSPCAQGC TNTDGSFHCS CEEGYVLAGE 
SIiCFNTOGSF KCXiCLFGHVIt 
SPTRGPESTP KATPTTSRFS 
PAGGDSSVAT 
PCBIAADSYSH 



CVSLLLDLSQ 
YTTPFQTTSS 
KYGCNFNNGG 
PHGKNVTCRC 



GCATGTACAA 



GCQTCAAGOJ 
CCCAGGAGAA 
AACTTTTOTC 
TGCACaTGAA 
TQAAGAAGGA 
CGAGCGGGGT 
ACXSCXSCACAT 
ACCCGCAGCA 
ACGAC6TGAG 
CGCCCRCCTA 



CATGATGGAG 
CAACTCCACX: 
GCCCATGAAT 
CCCCAAGATG 



GGAOAOGGAG AAGOSGCCQr TCATOCSACQh GGCTAAGCGO 



CAGCATGTCC 
GGTCAAGTCX: 
rrOGCAG 



: CXWACAQCQA 



GCOGQAACCC 
GCCOGGCACG 
ACTGGAGGGO 
AGAGTAAOAA 



CACAACTOQG 
AAGOSGCCQr 
GATTATAAAT 
CTGCCX^GGOO 
GCXSSGCXrrGG 
AGCAACGGCA 
AATGCSQCAGG 
TACAACTCCA 
TACTOGCAOC 
QAGGCCAGCT 
GCOGGGGACC 



A6ATCAGCAA GCGCCIGGGC 240 



TAAGTACACG CTGCCX^GGOO QGCTGCTGGC CCCCGGOGGC 420 

GCX3CGGGCGT GAACCAGCGC 480 

GCTACAGCAT GATGCAGGAC 540 

GOGCAGCGCA GATGCftGCCC 600 

TGACCAGCTC GCAGACCTAC 660 

AGGGCACCCC TGGCATGGCT 720 
CCAQCCCCCC TGTOaiTACC 



TOSGGGACWT GATCAGCKIG 840 



ACAQCATGGA GAAAACXXXSG TACXCTCAAA 



1 11 21 31 41 51 

I I I I I I

RSABMyNMMB TELKPPGPQQ TSGGGGQIST AAAAGGHQKH &PDRVKRFf4N AFHVWSRGQR 
RKMAQKNPKM HNSEISKRLG AEWKLLSETE KRPFIDEAKR LHALHMKEHP DVKYRPRRKT 
KTLMKKDKYT LPGGLLAPGG NSMASGVGVG AGLGAGWQR MDSYAHMNGW SNGSYSMMQD 
QLGYPQHPGL NAHGAAQMQP MHRTOVSALQ YNSMTSSQTY MHGSPTYSMS YSQQGTP04A 
LGSMGSWKS EASSSPPWT SSSHSRAPCQ AGDLRDMISM YLPQAEVPBP AAPSRLHMSQ 
Seq ID NO: 348 DNA sequence
Nucleotide Accession #: NM_002638
Coding sequence: 120-473 

1 11 21 31 41 51 

I I I I I I

CAA.TACAQCT AAGQAATTAT CXICTTGTAAA TACCACAGAC CXa3CX:CTGGA GCCAGGCCAA 
GCTGGACTGC ATAAAGATTG GTATGGCCTT AGCTCTTAGC CAAACACCTT CCTGACACCA 
TGAGGGCCAG CAGCTTCTTG ATOSTOGTSG TGTTOCTCaT CX3CICGGACG CTCGTTCTAG 
AGQCAGCTGT CAOQGGAGTT CCTGTTAAAG GTCAAOACAC TOTCAAAGGC CBTGTTCCAT 
TCAATGOACA AGATCCCGTT AAAGGACAA6 TTTCAGTTAA AG6TCAAGAT AAAOTCAAAO 
OGCAAGAGCC AGTOVAAGGT CCAGTCTCCA CTAAOCCTGG CTCCTGCOCC ATTATCTTCA 
TCCGGTGCGC CATGTTGAAT CCCCCTAACC OCTGCrTQAA AGATACTGAC TGCCCAGGAA 
TCAAGAAGTG CTGTGAAGGC TCTTGOGGGA TGGCCTGTTT CXJTTCCCCAG TGAAGGGAGC 
CGGTCCTTGC TGCACCTGTQ CXOTCCCCAG AQCXACAGQC CCCATCTQQT CCTAASTCCC 
TGCTGCXXTT CCCCTTCCCA CACTGTCCAT TCTTCCTCCC ATTCAOQATa CCCAOQGCTG 
GAGCTGCCTC TCTCATCCAC TTTCCAATAA A 



I I I I I I 

MRASSFLIW VFLIAGTLVL EAAVTGVPVK OQDTVKGRVP FNGQDPVKGQ VSVKGQDKVK 
AQEPVKGPVS TKPGSCPIIL IRCAMUIPFK RCIiKDTDCPa IKKCCEGSOQ MACFVFQ 



Seq ID NO: 350 DNA sequence
Nucleic Acid Accession #: NM_007183
Coding sequence: 75-2468 

1 11 21 31 41 51 

I I I I I I 

GAATTCCGGA CAGGACGTGA AGATAGTTGG GTTTGGAGGC GGCCGCCAGQ CCCAGGCCOG 60 

GTGGACCTGC CGCCXTGCAG GACGGTAACT TCCTGCTCfTC OGCCCTGCAa CCTGAGGCOQ 120 

GCGTGTGCTC CCTGGCGCTG CCCTCTGACC TGCaGCTGOA CCGCCGGGQC GCCGAGGGQC 180 

CGGAGGCCGA GCGGCTGCGG GCAGCCCGCG TCCRGGAGCA GGTCCX3CGCC CGCCTCTTGC 240 

AGCTGGGACA GCAGCCGCGG CACAACGGGG COGCTGAGCC CGAGCCTGAG GCCGAGACTG 300 

CCAGAGGCAC ATCOWSGGGG OMSTACCACA CCCTGCAOGC TCGCTTCAGC TCTCX3CTCTC 360 

AGGGCCTGAG TGQGGACAAG ACCTCGGQCT TCCMGCCCAT CQCXaAGCCXJ GCCTACaVGCC 420 

CAGCCrCCTG GTCCTCCCGC TCCGCXXTCQ ATCTGAOCia CAOTOGGAGG CTGAGTTCAG 480 

CCCACAATG6 GGGCAGCGCC TTTGGQGCOG CTGGGTAOGQ GGGTGCCCA6 OXACCXXTC 540 

CCATGCXraVC CAGGCCCaSTG TCCTTCCATG MSOBCBOIGB GOTTGGGAOC OGOGCXXKCT 600 

ATGACACACT CTCCXnX3CXSC TCKCTGCXSGC IGGGGCCOGG GGGCCIGGAC GACC GCTACA 660 

GCCTGGTGTC TGAGCAGCTG GAGCC0GCXK3 CCACCTCCAC CTACAGGGCC TTTOOGTACG 720 

AGCGCCAGGC CAGCTCCAGC TCCAGCCGGG CAGGGGGGCT GGACTGQCCC GAGGCCACTG 780 

AGGTTTCCCC GAGCCGGACC ATCXXSTGCCC CTGCCGTGCG GACCCTGCAG CGATTCXa«3A 840 

GCAGCCACCG GAGCCGCGGG GTAGGCXSGGG CAGTGCCGGG GGCCGTCCTG GAGCCAGTGG 900 

CTcbAQCGCC ATCTGTGCOC AGCCTCAGCC TCAGCCTGGC TGACTCGGGC CACCTGCXCG 960 

AOGTOCATGG GTTCAACAGC m«SSGTAGCC ACOGAACXXT GCAGASACTC AQCAGCGOTT 1020 

TTGATOACAT TGACCTGCCC TCAOCAGTCA AGTACCTCAT GGCTTCAGAC CCCAACCTQC 1080 

AGGTGCIGGG AGOGGCCTAC ATCCAGCACA AGTGCTACAa OGATGCAGCC GOCAAGAAQC 1140 

AGGCCCGCAG CCTTCAGGCX: G7QCCTAGGC TGGTGAAGCT CTTCAACCAC GCCAACCAGG 1200 

AAGTGCAGCG CCATGCCACA GGTGCCATGC GCAACCTCAT CTACGACAAC GCTGACAACA 1260 

AGCTGGCCCT GGTGGAGGAG AAOGGGATCT TCGAGCTGCT GCGGACACTG CGGGAGCAGG 1320 

ATQATGAGCT TCGCAAAAAT GTCACAGGGA TCCTGTGGAA CCTTTCATCX: AGCGACCACC 1380 

TGAAGGACCG CCTGGCCAGA GACACGCTGG AGCAGCTCAC GGACCTGGTG TTGRGCCCCC 1440 

TGTCaSGGGGC TGGGGGTCCC CCX:CrCATCC AGCAGAACX5C CTOSGAGGaS GAGATCTTCT 1500 

ACAAGGCCAC OGGCTTCCTC AQGAACCTCA GCTCAGCCTC TCAGGCCACT COCCAGAASA 1560 

TGOGGGASTG CCAOGGGCTG GTGGAC6CCC T6STCACCTC TATCAACCAC GGCCTGOAGQ 1620 

C6GGCAAATG CQAGQACAAG AGCXTCGGAGA AOGOGBTO T G OGTCCTGOaO AACCTOTCCT 1680 

ACCGCCTCTA CGAOAGATG C0GCXX3TCCG CXSCTGCAGCa GCTGGAGGGT OGOGGCCGCA 1740 

GGGACCTGGC GGGGGCGCCXS CCGGGAGAGG TCGTGGGCTG CTTCACGCCG CAGAGCCGGC 1800 

GGCTGCGCGA GCTGCCCCTC GCCXSCCGATG CGCTCACCTT CGCGGAGGTQ TCCAAGQACC 1860 

CCAAGGGCCT CX3AGTGGCTG TGGAGCCCCC AGATCGTGGQ GCTGTACAAC CGGCTGCTGC 1920 

AGOGCTGOGA GCTCAACCGG CACACGACX3G AGG0GGCC5GC CGGGGOGCTG CAGAACATCA 1980 

CXSGCaGGOGA CXXKaGGTGG GOGGGGGTGC TGAGCCGCCT GGCCCTGOAG CAGGAGCGTA 2040 

TTCTGAACCC CCTGCTAGAC CGXGTCAGGA CCX3CCGACCA CCACCAGCTG CHCTCACTGA 2100 

CTGGCCTCAT CCGAAACCIQ TCTCXSAACG CTAGQAACAA QOACGAGATG TCCACGAAGG 2160 

TGGTQAGCCA CCrOATCQAO AAaCTOOCAG aC3\G0GT6Ga TGAGAAGTCG CCXXX:A0CCG 2220 

AGGTGCT G GT CAACRTCATA GCTGTGCrCA ACAAOCTGei GGTGGCCAGC CCCATCGCTG 2280 

CCCSGAGACXrr GCTGTATTTT GACGGACTCC GAAAGCTCAT CTTCATCAAG AAGAAGCGGG 2340 

ACAGCCCCX3A GAGTGAQAAG TCXTTCCXXJCG C»GCATCCAG CCTCCTGGCC AACCTGTGGC 2400 

AGTACAACAA GCTCCACOGT GACTTTOGGG C!GAAGGGCTA TCGGAAGGAG GACTTCCTGQ 2460 

GCCCATA6GT GAAGCCTTCT GGAGGAGAAG GTGACGTGGC CCAGOGTCCA AGGGACAGAC 2520 

TCAGCTCCA6 GCTX3CTTGGC AGCCCAGCCT GGAGGAGAAG GCTAATOAOG QAGGGGCCCX: 2580 

TOGCTGGGQC CSa^IGTGTGC ATCTTTGAGG GTCCTGGGOC AOCAQGAGGO GCAGGGTCTT 2640 

ATAGCTGGGa ACTTGGCTTC CGCAGGGCAG GGGGTGGGaC AGQGCTCAAG GCIGCTCTX3G 2700 

TGTATOGGGT GGTSICCCAO TC31CATTGGC AGAGGraGGG GTTGGCTQTQ QCCIOaCAaT 2760 

ATCITGGGAT AGCCAOCACT GGQAATAAAG ATGGCCMOA AC3U3ICAC3A AAAAAAAAAA 2820 
AAAAGGAATT C 

Seq ID NO: 351 Protein sequence
Protein Accession #: NP_009114.1

1 11 21 31 41 51 
I I I I I I

HQDGtl7U.SA LQPEAGVCSL AI.PSDIiQU>R RGAEGFEAER UtAARVQEQV RASI.LQI'SQQ 60 

PRHNOAAEFE FEAETARGT8 ItGQYHTI.QAG FSSRSQGLS6 OKTSGFRPIA KPAYSFASWS 120 

SRSAVDLSCS RRI.SSAHNGG SAFGAAGYGG AQPTPPMPTR PVSFKBRGGV GSRADYim.S 180 

liRSLRLGPGG UJDRYSIjVSE QI.EPAATSTY RAFAYBRQAS SSSSRAGGI.D WPEATEVSPS 240

RTIRAPAVRT LQRFQSSHHS RGVGGAVPGA VI^;PVARAPS VRSLSLSLAD SGHM-OVHOP 300 

NSYGSHRTLQ RLSSGFDDID IiPSAVKYLMA SDPNLQVLGA AYIQHKCYSD AAAKKQAHSL 360 

QAVPRLVKLF NHANQEVQRH ATQAMRMIiIY DHAONKIALV EENGIFELIiR TLREQDDELR 420 

KNVTGILWNL SSSDKI.KDRI1 AHDTLEQLTD LVLSPLSGAG GPPLIQQNAS EAEIFYHATG 480 

FIjaJLSSASQ ATRQKMRBCH GI.VDALVTSI HHALDAGKCB DKSVENAVCV LRMLSYSLYD 540

EMPPSALQRL EGRGRRDIiAG AFPCEWGCF TPQSRRIiREIi PIAAOALTFA EVSKDPKGIjB 600 

WLHSFOIVQIi YNEtLUlRCEL NRBTTBAAAG ALQUriTACSR SWAGVLSSLA LBQERILNPL 660 

LDRVRTADHH QbRSIiTGIiIR NI.SRHARNKD BtSTKWSHL lEXLPOSVQB K9PPAEVLVH 720 

IIAVIJOILW ASPIAARDLL YFDGLRXLIP ZKKKRDSPOS EKSSRAASSL UUnMQYIiKL 780 
HRDFRAKGYR KEOFIiGP

Seq ID NO: 352 DNA sequence
Nucleic Acid Accession #: M31469
Coding sequence: 1-651

1 11 21 31 41 51 

I I I I I I

ATGGCTGCGC AGGGAGAGCC CCAGGTCCAG TTCAAACTTG TATTGGTTGG TGATGGTGGT 60

ACTGGAAAAA CGACCTTCGT GAAACGTCAT TTGACTGGTG AATTTGAGAA GAAGTATGTA 120

GCCACCTTGG GTGTTGAGGT TCATCCCCTTA GTGTTCCACA CCAACAGAGG ACCTATTAAG 180 

TTCAATGTAT GGGACACAGC CGGCCAGGAG AAATTCGGTG GACTGAGAGA TGGCTATTAT 240 

ATCCAAGCCX: AGTGIGCCAT CATAATGTTT GATGTAACAT CGAGAGTTAC TTACAAGAAT 300 

CnGCCTAACT OOCATAOAaA TGTCGTACSA OIGTOrOAAA ACATCCCCAT TGTGTTGTGT 360 

GGCAACAAAG TGQATATTAA GGACAGQAAA GTGAAGGCQA AATCCATTQT CTTCCACOSA 420

AAGAAGAATC TTCAGTACTA OQACATTTCT GCCAAAA8TA ACTACAACTT TQAAAAGCCC 480 

TTCCTCTGGC TTGCTAGGAA GCTCATTGGA GACCCTAACT TGGAATTTGT TGCCATGCCT 540 

GCTCTCGCCC CACCAQAAGT TGTCATGGAC CCAGCTTTGO CAGCACAGTA TQAGCAOQAC 600 
TTAGAGGTTG CTCAGACAAC TGCTCTCCCG GATGAG6ATG ATOACCTGTO A

Seq ID NO: 353 Protein sequence
Protein Accession #:

1 11 21 31 41 51

I I I I I I

MAAQGEPQVQ FKLVLVODGQ TGKTTPVKRH LTGEFEKKYV ATLGVEVHPL VPHTNEGPIK 
FMVHOTAGQB KFQGLRD6YY IQAQCAIIMF tWrrSR VTYlCM VPNMHRDLVR VCGNIPIVLC 
GNKVDIKDHK VKAKSIVFHR KKNU3YYDIS AKSNYNFBKP FIiHIiARKLIQ DPNIiEFVAMP 
AUVPPEWHO PALAAQYGHD LEVAQTTALP OEDDDL 



Seq ID NO: 354 DNA sequence
Nucleic Acid Accession #: NM_002820
Coding sequence: 304-831 



1 11 21 31 41 51

I I I I I I 

CCGGTTOSCA AAGAAGCTGA CT-TCAGAGGG GGAAACTTTC TTCTTTTAGQ AGGCG6TTAO 60 

CCCTGTTCCA CGAACCCAGG AGAACTGCTG GCCAGATTAA TTAGACATTG CTATGGGAOA 120 

CGTGTAAACA CACTACTTAT CATTGATGCA TATATAAAAC C3VTTTTATTT TCGCTATTAT 180

TTCAGAGGAA GCGCCTCTGA TTTGTTTCTT TTTTCCCTTT TTGCTCTTTC TGQCTGTGTG 240 

GTTTGGAGAA AGCACAGTTG GAGTAGC03G TTGCTAAATA AGTCCCGAGC 6CGAGCGGAG 300 

ACGATGCAGC GGAGACTGGT TCAGCAGTGG AGCGTCGCGG TGTTCCTGCT GAGCTACGCG 360 

GTGCCCTCCT GCGGGGGCTC GGTGGAGGGT CTCAQCCGCE GCCTCAAAAG AGCTGIGTCT 420 

QAACATCAGC TCCTCCATQA CAAGGGGAAQ TCCRTCCAAa ATTTAajGOS ACOATTCTTC 480

CTTCACCATC TGATCGCAGA AATCCACACA GCTGAAATCA OAGCTACCTC GQAOSIGTCC 540 

CCTAACTCCA AGCCCTCTCC CAACACAAAG AAOCACCCCQ TCCGATTTGQ GTCIGATGAT 600 

GAGGGC3«3AT ACCTAACTCA GGAAACTAAC AAGGTGGAGA OGTACS^AAOA GCAGCCGCTC 660 

AAGACACCTG GGAAGAAAAA GAAAGQCAAG CCCGQGAAAC GCAAGGAGCA GGAAAAGAAA 720 

AAACGGCGAA CTCGCTCTGC CTGGTTAGAC TCTGGAGTGA CTGGOAGTGa GCTAGAAGGG 780

GACCACCTGT CTGACACCTC CACAACX3TCG CTGGAGCTCG ATTCACGGTA ACAGGCTTCT 840 

CTGGCCCGTA GCCTCAGCGG GGTGCTCTCA GCTGGGTTTT GGAGCCTCCX: TTCTGCCTTG 900 

GCrrGGACAA ACCTAGAATT TTCI»3CTTT ATGTATCTCT ATCGATTQTa TAGCAATTGA 960 

CAGAGAATAA CTTCAQAATAT TGXCTaCCTT AAAGCAGTAC COCC CTACCA CACACACCCX: 1020

TGTCCTCCAG CACX:ATAGA6 AGGGGCTAOA QCCCATTCCI CTTTCTCCRC CGTCAOCCAA 1080

CATCAATCCT TTACCACTCT ACCAAATAAT TTCATATTCA AGCTTCAfiAA GCTAGTQACC 1140 

ATCTTCATAA TTTGCTGGAG AAGTGTATTT CTTCCXXTTTA CTCTCACACC TGQGCAAACT 1200 

TTCTTCAGTG TTTrTCATTT CTTACGTTCT TTCACTTCAA GGGAGAATAT AQAAGC3VTTT 1260 

GATATTATCT ACAAACACTG CAGAACAGCA TCATGTCATA AACGATTCTG AGCCATTCAC 1320 

ACTTTTTATT TAATTAAATG TATTTAATTA AATCTCAAAT TTATTTTAAT GTAAAGAACT 1380

TAAATTATGT TTTAAACACA TGCCTTAAAT TTGTTTAATT AAATTTAACT CTGGTTTCTA 1440 

CCAGCTCATA CAAAATAAAT GGTTTCTGAA AATGTTTAAG TATTAACTTA CAAQGATATA 1500 

GGTTTTTCTC ATGTATCTTT TTGTTCATTO GC3U«3ATGAA ATAATTTTTC TAGGGTAATG 1560 
CCGTAGGAAA AATAAAACTT CACATTTAAA AAAAA 




I I I I I I

MQRRIiVQQWS VAVFLI.SYAV PSCGRSVBGI. BRRI.KRAVSB HQLLRDRGKS IQDLRRRFPIi 
HHLIAEIHTA BIRATSEVSP HSKPSPHTKIt HFVRFGSDDE GRYLTQBTNR VETYKEQPIiK 

TP6KKKKGKF GKRKEQEKXK RRTRSAHLOS aVTGSGI.ECH) HLSOTSTTSIi ELDSR 

Seq ID NO: 356 DNA sequence
Nucleic Acid Accession #: NM_017522
Coding sequence: 1-2100

1 11 21 31 41 51 

I I I I I I 

CCCTCTCCGG CTTCTGGCGC TGCTGCTGCT GCTGCTGCTG 
GCATCTTGOG GCGGCAGCGG CTGATCOGCT aCTCGOCGGC 
CX3AAAAG6AC CARTTCCAGT GCCGQAACX3A GCQCIGCATC 
OCCTCroiGT GGAQATQCQA CGAGGAGGAT OACTGCTTAG ACCACAGOBA CGAGGACGAC 






GAACGGTGOA 
ACTTGCACCA 
TGTGTACCTG 



AC31TGTGTCC 
GAAGCTGGCT 
ATCTGCSVCTQ 



AGACCTGTGC 
AGTGTQACXKJ 
AGCAGQTGTa 
CCTOGTGGCG 
CTACCTCACT 
TTGCAATCAA 



TGTGTCAATT 
CTGACXaUVGA 
ACGAGTGCX3G 
AATGTCGTGG 
TACCQTAAGA 
CTCATTGACG 
ATCTACTGGA 
GGAGGCACTC 



ACCTCRAGAT 
CTTGTGGCQA 
ACAAfiGGCTA 
ACTGCAAGGC 
AGGATOGACC 
CACTAGATGT 



GGGCACCTGC 
QCACTGCAAC 
QCTGAAOGAG 
TGGCTTTGKA 



TTCACCTGTG 
TGTCXTGATQ 
AAGCTGAGCT 
GAGAAGGACT 
CGTGGGGACQ 



TGCACXSTGCC 

tgcaaqgacc 
gagtSctacc 
aagagcccat 
aactattcac 
accaatcgca 
aaggccactg 



ACAAOOQCCA CTGCATCCAC 
GCTCCQATGA CrCCGAGGCX: 
GTGGACCCAC CAGCCACAAG 
GCGA6GGTGG A60GGATGAG 
AGTTCCAGTG TGGGGATGGG 
ACTGTCCAGA TGGGAGTGAT 
ACAATGGOGG C 



CAQMX3CCTO CSkGCCAGATC 
CTGGCTG08A GATG6ACCTA 
CCCTAATCTT CACCAACCGC 
GCCTCMCCC CATGCTCAAG 
TCTACTGGTG TGACCTCTCC 
ACCOSAAAQA GOSGGAGGTC 



;: ATCTCnGTGG C 



CTGGATCTGC 
ATTGACTTCA 
CCTTTTQGGA 
ATTTTCAGTG 
AACCCACATG 



TGTATTGGTC 
ACCGGCSWkC 
TGAGCCAOCG 



TCACIGGGOG GACCAG6CXA AGATTOAGAA ATCTGGGCTC 



ATGAAGAGOT 
QCTGTTATCS 
CTGATCTGGA 
TACAGGAAAA 
ATTGGCCATG 
GGATGGGATC 
GTGTATQACT 
GATTTTTTTT 
ACATCCAAAG 
GCACTACXXav 
AATGGGGGCC 



TAGCTGTGTT 
CAAATCGGCT 
ACATTGTCAT 
TCX»GCCTAA 
CTCCCAAGTA 
GCTACCGAGA 
GGATCATCGT 
GAAACTGGAA 



ACTGGTCTCA 
CTTGTACTGG 
CAGAAAGACG 



TCTATCCTGC 
ACCCCCTTCG 
GGATQAATGG 
TTTAAATTTA 
GATGT6AGAG 
TGAGGAATTC 



ATGCTTTGTG GCTATCCATC 



CAATGGCCTQ 
CTTCCATGAG 
TGQAGGCTGT 
CACATGTGCC 
TGCAAATGAA 
QCCCATAGTG 
6C6GAAGAAC 
AGAAGATGAA 
ACGAGXGGCA 
TGCCTCATGG 
GTTTCTATAT 
TGTTGCGGAA 
TTTTTCTATG 
GTGGAATGGC 
TACCTTACTC 
TTTAGGTTTT 
AACATAAGT 



GTAGACTCCA 
CTGATCTCCT 
GTGTTCTGGA 
GAAATCTCCa 
CTGAAGCAGC 
GAATACCTOT 
TGTCCTOACa 



AATGGCCCAA 
AGCTACACCA 
CCACTGACTT 



GTGATAGCCC 
ACCAAAAGCA 
GATGAGCTCC 
TTAAGCCTTO 
AATTCAGTCC 
ATGGGTCTGT 
AGGTAACCAC 
TATAATGTTT 
TACTGCTGAC 
ATCATTTAAA 
GGGCATTTGT 



TCCTGGCTGA 
CAAGAGCTCK 
GCCTTCCTGC 
CAATGTGGCT 
TGGGCTCAAC 
TCCTGTGCAT 
TOAATTTraA 



ACTGTCCRGC 
CCTGAGCCAC 
GAACGAGGCC 
GAACCTCAAC 
AGATGCCTGT 
TCCTCAGATC 
GGGTCCAGAC 
AGTCACTGCC 
GAGTGGATAC 
CAACCCAGTC 
AACTGCTCAG 



CATGCACIAC ACTCGaOATG 
GTGAGT6TAT aTGTGTGTGT 
AAAOTT ATQA TGAACT6CAA 
TATACACTTT TTAACTGGTP 
TAACATGATS CACATAACCA 
AACTATAT TT ACAOAAGATG 
TTTTTQTAAA TAAOATGATT 



2040 
2100 
2160 
2220 

2340 
2400 
2460 
2520






TSAEDRPVKR NYSRI.IPMLK HWALDVEVA TMRIYWCDIiS WUCIYSAYMD KASDPKERBV 
LISBQUISFE GUWDWVHKH lyWTDSGHKT ISVATVDGCBt RRniFSRHIaS EPRAIAVDPL* 
RGFMYWSDWG DQAKIEKSGL NGVDRQTLVS DNIBWPNGIT UJIiLSQRLYW VDSKLHQLSS 
IDFSGOIRKT LISSTDFLSH PFGIAVFEDK VFWTDIiEHEA IFSANRIiHGIi EISIUVENIiH 
NPHDrVIPHE LKQPRAPDAC ELSVQPNGGC EYLCLPAPQI SSHSPKYTCA CTOTMMLGPD 
MKRCVWIANE DSKMGSTVTA AVIGIIVPIV VIALLCMSGY LIWHHWKRKll TKSMNFDNPV 
YSXXTEEEDE DELHIGRTAQ IGHVYPARVA LSLEODGI^ 

Seq ID NO: 358 DNA sequence
Nucleic Acid Accession #: 1427826

Coding sequence: <1-503



A6CCCAAGAA A 



: AATTTCAAAT 



CTCTCACAGT 
OGTTGCCTTC 
GCCAAGCTTC 
TTATGCACTC 
TAACCAAATT 
TTCTTCCCAA 
CACAAGACCT 
CACTTGAAGC 
TTCGCOGCTC 



TTXTCAAGOG 



TTTTTTAGTT 
ATCTGCTTCC 
TCCAAAGCCT 
CCCTTCASCT 
AGCCCTGAGA 
CAACACTTCA 



TCCATCCCCr 
CCTQTTTCCC 
AAAACTCCCC 
ATCCCCACCT 
CTGACTATTC 
CCTTTGTGTC 
TAATCTCTCC 
AACATC6CCC 
ACACTATTTT 



CTQATCTATT 
GOACChTCAC 
eTTTAATOGA 



OGGCTTAGCG 



CACTCTQGTQ 
GCCCACTTCC 
CTGGAGTACA 
CTCTAACATC 
CACTCTAGGT 
ATTCTCTCTC 
GTTTTATTTG 



AACIGTT6T6 
CCAACTTGGA 
CTTATTAGQC 
GCTACATCTC 
CCCACAATAT 
TCCCACGCOQ 
CATACCACCC 
TCTTATTAAT 



ACTGAAGATT 
CTTCGGGTAA 
AOOCaCTCCA 



CAACACTCTT 
CQAAATATTT 
ATTGCTGCCC 
CAGCCCTTAC 
CCCCTAATCC 
CCCAAAAATT 
ATCAGAAGGC 






ICKr CCCCTOIOAC TTCCAOQTAT 
ACATCCAGAT GOCCTGAAST AACTGAASAT CCACAAAAQA AGTAAAAACA GCCITAACTG 
ATGACATTCC ACCATTGTGA TTTGTTCCTG CCCCACCCTA ACTGATCAAT GTACTTTGTA 
ATCTCCCrCA CCCTTAAGAA GGTTCTTTGT AATTCTCCCC ACCCTTGAGA ATQTACTTTG 
TGAGATCCAC CCCTGCCCAC CAGAGAACAA CCCCCTTTGA TTGTAATTTT TTATTACCTT 
CXCAAATCCT ATAAAACAGC CCCACCCCTA TCTTCCTTCA CIGACTCICT TTTCXSGACTC 
AQCCACOGGC ACCCAGGTGA AATAAACAGC TTTATTGCTC AC 






| | | | | |

PKKHLTNFKS DLFGIATEDW RCPIASBVPW TITBABLRVT LTVEGKSIPC LIDTGATHST 
LPSFQGPVSL APITWGIDG QASKPIiKTPP LHOQLGQHSF MHSFLVXPTC PLPTJiGRMIL 
TKLSASLTIP GVQLHLIAAL LPMPKPPIiCP LTSPQYQPIiP QDIiFSA 

Seq ID NO: 360 DNA sequence 
Nucleic Acid Accession #: NM_001854
Coding sequence: 162-5582



CCCCAATGGC 
CAGTCGCAAT 
TGGTCCTCIA 
TTQACCTTCC 



\. TTTAGAAGAA AAAGCCCTTT G 



GGTGGAAAAC 
TCTTCCAAGC 
TTCACAATTC 



AACAQTTATT 
CAAAAAAAGG 
TTGGTGTTGA 
CCCCAGAAGA 
TAGCAATCAG 
CX3AAACCACT 
GAACAAGGAT 
6TGATCCCAA 



TOSAATATGA 
GACCCACTGT 
AAGAATACAA 
GGACAAATGA 
ATTATGATTC 
GCAGGGATTC 



GAGTTCACAG 
QAAACGGTGG 
TAGAGAGGTC 
TCCAGASGGA 
AGATACTGCT 
TCCAGGTGGA 
AATTC3U3TCT 
GGTTGGGAGA 
CTATCCCCTC 
CGTGGAGAAG 
TGATAGAAGT 
TTTGGATGAA 
GGCAGCATAT 
T6CTCAAGCT 
CTATGAGTAT 
AACTGAGGAG 



C GGGGCTCXiGA GATGGAGCCG 
r TCACCGTAAC 



CAGAAACTGA 
AGAAAGQAGA 
CAGQACCTGC 



GCCAAATCCa 
CCAGAGGAAA 
TGATCTTCTG 
AGATAAACCA 
TATTACAGAA 
ACX»GC3MSTG 
AGGTATTATG 
TAGGGG CCCC 
TACTATGTTO 
TGCTCAGGAA 



ATATCAAAAA 
TACAQASTTT 
ACTTTCCCAG 
TTCCTTTTAT 
TCACCTGTTT 
TTCAGAACTG 
AAAACTGTGA 
GA6AGAGCAA 
OAAGTTTTTG 
GACTACTGTG 
CAGGAACCTC 
GGGGAAGCAG 
ACAATAGCAC 
ATGGAAAGTT 



CAACQG6ATT T 



AAGACTTTTC 
CTATATATAA 
TTCTGTTTGA 
TTAACATOGC 
CAATGATTGT 
TTGTTGATAC 
AGGGGGACAT 



AATACTATTT 
TGAGCATGGT 
AQACCACACT 



ABAAAGAATT 
QCOCCAACAA 
ACAGTAAAAC 



AATTCTGAGG 
GTAGATGGAG 
ACAAGCCCCC 
ACAAGCATAA 
GTTGAGCCTG 
GGTCCTCCAG 
CCAGGACGTC 
ATOTTACOGT 
GCTCAGGCTC 



AGAXAGAXGA 
AGTATAAAGA 
AGACGGAGGC 
ACCAGACAGA 
TATTTACTGA 
ATACACTATA 
ATTTAGGCGA 



CTAATGAAGA 
ATGGCCATGG 
GTATGCTTGT 
GTCTACAAGO 
CTGGCTTACC 
TC0QTTATG6 
AA6CTATTCT 



TGATTGTAAG 
CAATGGAATC 
TCAGCAQTTT 
TCXAGACTGT 
GTATGCACCA 
GGCTGAAAGT 
AAACATCGTT 
AGCTCCTAGG 
A6AATATCTA 
TGAAAACAAA 
ATATGAT 



ATTTGGTCCA 
TGCATATGGA 
tX3AAGGACCA 
CCCCACTGGA 



AAGAAAACCA 720 
ACGGTTTTTG 780 
TTGATCACAG 840 
GACTCnCAG 900 
GAGGATATAA 960 
GTAACAGAQG 1020 
GATGATTTTC 1080 
CATGTTTCTG 1140 
1200 
1260 
1320 
1380 



GAAATAGACG 
TATGAATATA 
GGTGTACCA6 
GAGAAAGGAC 
CCAGGACCAQ 
CCCCCTGGTG 
GGTCTACCTG 
TCCAAAGQAC 
CGQATTGCTC 



GTCTGCCAGG 
CTGGTGATGA 
AAGCTGGCCC 



GGCX3U^AGGT 
TGGTCCAACG 
GCCAGGAGAA 



TGGA ATGAGG 
AC6AGGTTTG 
TGTAGATOGC 



ATCCAGGTCC TCAGGGCCCT CXSAGGOGTCC 
GGAAAACCTG GAAAAAGGGG TCGTCCAGGT GCAGATGGAO 
CCTGGGGCAA AGOGAGATCG AGGGTTTQAT GGACTTCCGG 
CACAGGGG3G AAOSAGGTCC TCAAGGTCCT CCAGGTCCTC 
GGAGAAGATG GAGAAATTGG ACCAAGAGGT CTTCCAGQTG 
CTGGGTCCAA GGGGAACTCC AGGAGCTCCA GGGCAQCXira 



TGGGTOCCCC 
CAGATGGTGT 
GATTCAAAGG 
GAGGGNAAGA 
CTTCAGGTCA 



GCCTCCTGOT 
TGOTCCACAA 
CA6AGGTCTC 
TGACATGGGT 
TGGCCCTGAA 



GGTCCTATTG 
AAGG6ATCTA 
CTAAAAGGTG 
GGACCCAAAG 



3 GACCTCAGGG TCTT CCTG OT CCR CAAOG TC 
: AOQACITGCr GGACTTCCra 
\ OrCTGGAGAA 



OAAOACAAGa TCCAAAG6GT TCCACI 



GATNNCCG6S CCCCCGGGGA GTAAAGGOAQ 
AAGGTGAAAA GG6TGAAGAT CGTTTTCCA6 
ACAGAGGAGA AGTTGGTCAA ATTGGCCCAA 
GTCXJAGCAGG CCCAACT6GA GACCCAGGTC 
TTGGAGTTCC AGSATTACCA GGAXATCCAG 
TCCCIGG G TT TCCAGGX 



CTOQAGGTTC AAGAGGTGCSV. AGAGGTCCCA CIGGGAAACC TGGGCX3lAAa GGCACTTCAG 



TTGGATTCCC 
ACCCTGGGCA 
GAGTGQTTGG 



CCCTCCTGGC 
TGGACCAAAA 
ACGTGGGGAG 



: TGGCCCTCCr 
\ TCCAGQTCCT 
3 GGAAAGAaGT 



CCTCCAGGTG 
GGCxrCTCCTO 
ACTGGATTTC 
CCAACCGGTQ 



AAAGAGGTCC TCAAGGACCT CAGGGTCCAQ 
GACCACCAGS AAGGATGG6C TGCCCAGGAC 
AAGGCAAGAC 
AGACTGGTCC 
GTCTTCCTGG 
CAGGGAAAGA 
CTCAGGGTGC 



ACAAGGGTQA 



TGCTCCTGGA 
TGGTCTCCCA 
AATTGGTGAG 



GCCCTCCCGG TCCCCCAGGT 



ATGGTGAACC 
CCAGAGGCTT 
GTGAAAAAGG 



AGGTCCTAGA 
CCCTGGACCT 
TGAAAATOCG 



GAAAAAGGTC CCCAAGGGCC 
GGGCCAGCX6 GTCCTGCCGG 
CCGOQACAAA AAGGCAGCAA 
CTTCAAGGAC CAGTTGGTGC 
GGACAGCAGG GGATGTTTGG 
CCTGGTCCAA TAGGTCTTCA 



TGGACCAGCA 
ACCTG6ACTQ 
AGAAC6TGGG 
GGGTCCTCCT 
TGCAGG6AGA 



CGTGGGTATC 
AAAGAAGGTG 
GGATTACGTG 



GGGTGGCAAG 
CCCTGGAATT 
GCSUiAAAGGT 
GGGTCTGCCA 
ACCTGGTCCT 



TCAGCAGGTA 
GGTCCAGCTG 
GATGGAGTTC 
GAAGAOGGA6 
GGAGAAAATG 
GCTGGAGGTG 
GATGAGGGTG 
GGCCCACXTTG 
CCAGGCXX3A 



1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 

2040 
2100 
2160 
2220 
2280 
2340 
2400 

2520
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 

3300 
3360 
3420 



3660 
3720 
3780 
3840 




GAGGCCCTCA AGGTCCCAAT 
CAGTTGGTGG TGTTGGAGAA 
GGGAAGCAGG TGTAGGCGGT 
CTGGAGCTGC TGGACCTCCA 
ACCCGGGTCC TGTTGGTTTT 
GTCAAGATGG TGTTGGT6GT 
CTGQCCCATC TGGTGAGGCT 
CTGCAGGTGC AGAGGGRAGA 
GTCCTCCTGG AAAAACCGGC 
AAOaTCTTCa GGGCATCCCT 
AAGATG6ACC ACCTGGTCCT 
GCTCCAAGGG TGAAAAGGGA 
AA6GGGAAAA AOGTQACCGA 
ATGGGGGAAT TCCTGGTCCT 
GTCCTCAAGG CCCAAAGGGT 
GTGGTCTTCC AGGGCCTCCT 
CAATCTTGTC CTCCAAAAAA 
ATAATATTCT TGATTACTCG 
AACAAGACAT CGAGC3VTATG 
GTAAAQACCT GCAACTCAGC 
ACCAAGGTTG CTCAGGAC5AT 
CTTGCATTTA TCCAGACAAA 
AACCAGGAAG TTGGTTTAGT 
AAGGAAATTC CATCAATATG 
GGCAAAATTT CACCTACCAC 
OTIATGACAA AGCACTTOSC 
ATCCTTTTAT CAAAACACTG 
TCATTGAAAT CAATACACCA 
ACTTTGGTQA TCAGAATCAG 
AAGATTAAGA CAAAGAAC3VT 
TTTTGTGCCA CATGCAAGTT 
TACCATTTAG GAAATACOGA 
ATCATAAAGA TA TAAGTTGG 
TTCTCAACTC TCCTTTTCCT 
AATATATATT CATAAAAAAT 
TGTGTTTAAT AAATTGTAAT 
CCAAAACTTQ CRCGTGTCCC 
GAXGGCAATA ATATATGTAT 
TTTCTTTGGT TAATGATGAA 



11 



GGAGCTGATG 
AAGGGTGAAC 
CCCAAAGGAG 
GGXGCCAAGG 
CCTGGAGATC 
GACAAGGGTG 
GOCrCACXawS 



CTGGTCCTCC 
AAGATGGAGA 
GTCCTCCTGG 
AAGGTGCTAA 



TGAIGATGOC 
TGGGGAACTT 
TCXTTOGTCAA 
AAAACGAGGT 



GAGAAC3UU3G 
CTOGCTTACC 
TAATTGQCCT 
GAACTCAAGG 
TAGGTCCACC 
CTACTGGACC 
GTCCACCTGG 



CCAGTCGGTC 
GGTCCTGTGG 
ATOGOACCTC 
CATCCTGGTT 
GGGCTCCCTG 
GCTGGTCCCT 
AACAAAGGCT 
GGGCCTCCAG 
ACGAGAAGAC 
GATGGAATGG 
AAATTTCCAA 
CATCCTQACT 

TCcrrcKAAS rrrACTOTAA 

AAATCTQAGO 
GAATTTAAGA 
GTGCAAATGA 
TGTCATCAGT 
TTCCTGGGAT 
TATGATGQTT 
AAAATTSAtC 
AAGTTOSSAT 
ATCAAATCAA 
TTG ARTAA GG 
TGCCTTTGTG 
TOTGQCTAAG 
ATTTOAATTT 



TATTTTGTGT 
TGAATTCCGC 
TATGAAAATQ 
ATTCCTTTGT 



GGGGAAAACT 
CATTCCTGAA 
CAGCAGCCTG 
CAAATGATGA 
GTA06TOCAG 
AAGTACCTAT 
TTGAAGTTGG 
CAGAAAATOT 
ATCIATGGAA 
GGGGCAGAAT 
ATGOAAACAG 
CTTTGGTGCT 
TTCTCATCCA 
ACAGTTCTAT 
TSACTCTAAT 
AAGTTATSAT 
GTGTGTTT 



TGCAGGAAAG 
TCTCCCTG6A 
TGGTCTCAAA 
GATTGGTCCT 
ATCTCCAGGA 
TGGTCCTCCA 
CGCTGGCCRG 
TGAAGTC3VTT 
CATGCAAGCA 
TGGTTCCCTC 
QACCAATCCA 
TGAATATTGG 
TTTCACATCT 
TTCATCATGG 
GCTTTCATAC 
ACTTCTGACT 
GTATGATGTG 
GGAGATGTCC 
AAAAGGCTAT 
TOTTGATGTC 



GCTGGTCCAC 
CCTAAGGGTA 
GGCCCTGCAG 
CCGGGTCCTC 
CCTCCTGGAG 

qgtgcagaag 
cctggtccag 



CCAGOAGAAC 

gcaaaagggg 
ggcttaccag 
aaaggtgaca 
cagcctttac 
gatgcagatg 

AATTCCCTQA 
GCCOGAACTT 



AACAACGCTG 
CACAOACAAA 
GGCTGATTCT 
GTAGAAAACA 
TCCAGGATGT 
actqttatct 
ttatqaqgat 



CCAAAGGAGA 
TTAGATGTTG 
GCCTCTGCTC 
TCATCAGGAA 
TATGACAATA 
GAAAAAACTG 
Ala ATCftG TG 
TTTCTTOaCT 



CATATACAGG 



TGATTCCCAA 



ACTAAAACAG 
GTGTCCATTT 
GCCGAACTCT 
CCTAAGTCCC 



3900 
3960 
4020 
4080 
4140 
4200 



4380 
4440 
4500 
4560 
4620 
4680 
4740 

4860 
4920 
4980 
5040 
5100
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
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51 



| | | | | |

MEPHSSBMKT KRWIiWDPTVT TLALTFLFQA REVHGAAPVD VLKALDPHNS PBOISKTTGF 60 

CTTJRKHSKGS DTAWIVSICQA QLSAPTKQLF PGGTFPEDF3 IliFTVKPKKG IQSFI^SIYN 120 

EKGIQQIGVE VGRSPVFLFB DHTQKPAPED YPI.FRTVHIA DOKWHRVAIS VEKKT VTMI V 180 

DCKKKITKPIi DRSERAIVDT MGITVFGTRI LDEEVPEGDI QQFLITGDPK AATOYCEHYS 240 

PDCDSSAPKA AQAQEPQIDE VAPEDIIEYD YEYGEAEYKE AESVTEGPTV TEETIAQTEA 300 

NIVDDFQEVN YGTMESYQTB APBHVSGTNB PNPVEEIFTE EYLTGEDYDS QKKNSEDTLY 360 

BNKEIDGRDS DIiLVDGDLGE YDFYBYKEYB DKPTSPPNEE FGPGVPAETD ITBTSINGHG 420 

AYGEKGQKGB PAVVBPOII.V EGPPGPAGPA GIMGPPGLQG PTGPPGDPOD BGPPGRPOLP 480 

1 TMLMIiPFRYG GDGSKGPTIS AQEAQAQAIL QQARIAIiRGP P0PM3LTGRP 540 

PGEPGAKGDR 600 

PSGIiPGEAGP RGIiLGPRGTP 660 

LPGPQGPIGP PGEKGPQGKP 720 

PRGVKQADGV RGIiKGSKOEK 780 




GGKGENGPPG 
GLPGPPGEKG 
(an>GPPGEAG 
GEIiGPAGQDG 
GEAGAEGPPG 



FOGPTQBTGP 
ERGLPGAQGA 
APGEKGPQGP 
PFGLQGPVGA 
EMGDVGPWGP 



QGPQOPVGFP GPXGPPGPFG 
IGERGYPGFP 6PPGBQ6LFG 
PGI.KGGEGPQ GPPGPVOSPQ 
AGROGVQSFV GIiPGPAOPAG 



Z KGBAGPPGAA GPPQAKOPPG 



KrOPVQPQGP / 



IGPFGEQ6EK OSRGIfOTQa G 
GPPGLPGPQG PKGHKGSTGP AGOKODSOLP GPBGPPgPPG E 
NQADADDHII. OYSDGMBBIF GSUISLKQDI EBMRFPH6TQ 1 
EYMIDPHQGC SGDSPKVYCM PTSGQBTCIY PDKKSE6VSI £ 
LSYLOVEGNS INHVOMTFLK UtTASARQNF TXBOIQSAiW Tl 
EMSYDNHPFI KTLYDGCTSR KSYEKTVIEI KTPKIDQWPI V 



Seq ID NO: 362 DNA sequence
Nucleic Acid Accession #: NM_
Coding sequence: 351



ERGSAGTAOF I0LR6RPGPQ 
SPQBDQIKQE IGEPGQKGSK 
QXtSBGARGF FGPPGPIGLQ 
PPGSVGSVGG VGEKOEPGBA 
DOGPKGHPGP VGFPGDFGPP 
KRSPPGAAGA EGRQGBKGAK 
LFGAAGQDQP POFMGPFGIf 



1020 
1080 
1140 
1200 
1260 



| | | | | |

TTCCCCAGCA TTC3GA6AAAC TCCTCTCTAC TTTAGCACGG TCTCCAGACT CAGCCGAGAG 
ACAGCRAACT GCAGOXBGT 6ASAGAGCGA GAGAGAGGGA GAGAGA GACT CTCCAQ CCTG 
GGAACXATAA CTOCICTGCG AGAGGCGGA6 AACTCCTTCC CCAAATCTTT TGC3GGRCTTT 



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TCTCTCTTTA CCCAOCTCGG C 
CGTCTTCCCQ TTCGGCGTGT C 



AAACCAACAJV T6CCGAGAAC 
CCGGCCTOGA GCTGGGAATC 
GCAAGGCOGA CX3ACCCQAQC 
AOGCCTTCAT GGTGTGGTCG 
TGCACAACGC CGAGATCTCC 
ACAAGATCCC TTTCATTCGA 
COGACTACAA GTACCGGCCC 
CGGCCQCCGC CTCCTCCAAQ 



ACX»3AAGCGC 

GcxrrccTccc 

TGGTGCAAGA 
CAGATOGAGC 
AAGCGGCTGG 
GAGGCGGAGC 
ASQAAGAAGG TGAAOTCOGa 



TGCTGGCCGG 
CCACX3CCCGG 
C(XCQAGTGG 
GGCGCAAGAT 
GCAAACGCTG 



GCCAGTTOQQ 
AGGGCCCGGC 
AGCCGAGGCC 
CGAGAGCTCO 
CTCCACCGCC 
GCACATCAAG 
CATGGAGCAG 
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ATGGTGCAGC 360 



CCGGOGQAGA 



CGGGCGGCGC GGGCGGTGOa 
GCJGGCGGCGG GAAAGCAGCG 
COGCCGCCCT GCTGCCCCTG 
CTCCCAGCGC CTCGGCCrcC 
CGGGCAAGCA CCTGGCGGAG 
CGTCGTCOTC GCCCGTGGGC 



TCCAAAC03G 
GTTAGCAAAC 
GCTGCCGCCQ 



GCAGCAACGC 
CGCAQAAAAA 
CX3CACGCCAA 
CCGCCTCCTT 
CCGACCACCA 
CAGCXTTCXJOC 
AGOGCGTCTA 



CAAGCACATG 
CAACGCCAAC 
GGTCGGTGGC 



QAGCTGCGGC 
GCTCATCCTG 
CGCCGCCGAA 
CTCGCTGTAC 
CTCOGCAGCG 
CCTGTTOGGC 
CCCCAGCGAC 
CAGCCTGAGC 



CAGGCGGGGG 



CCCACTCCTC 
ACCTGCTOGA 
OGTCGTOGGC 
TOGAGTTCCC 
AGTCCAGCAT 



CTcrrccTcc 

CCTGAACCCC 
GCTCGACCGQ 
GGACTACTGC 
CTCCAACCTG. 
AGGAGAGGAG 



TCCTCGGGCT 
AGCTCAAACT 
GACCTGGATT 
ACGCCOGAGG 



CS3CCCTCGCA 
CCTCQTCCTC 
TTGAGAGC3VT 
TTAACTTCGA 
TGAGCGAGAT 
ACTGAAGGGC 



CGCGTCCTCX: 




AAAAAGTAAG 
GTTTTGGACC 
GGAGGGACGC 
TTGTTGTTGA 



GTCCCTGGGC 
GCCCGGCTOC 
GRTCTCX3GGA 
GCGCAGGCAG 
AAACX3AAAA0 
CAGGGCTCGT 
OGCGCTCCXa 
GGAGGAGQAA 



tccacgggcg 480 

cgaccx:atga 540

tcgcccgaca 600 

aaagacagcg 660 

gctqactacc 720 

TCCAGCTCCT 780 
AGTQGCGGGG 840 
GGCGGCGGTG 900 
TCCAAAGTGG 960 
1020 
1080 
1140 
1200 

1320 
1380 
1440 

1560 
1620 
1680 

GACTGGCTCO 1740 
GGAGAAGGGC 1800 
GACAGACCiAA 1860 
TOGCCCX3CGT 1920 



CTCGOSGCCC 
GGCCTGGGCA 
CCCCTGGGCC 



TACGCCAGCC 
TCX3GCCTCGT 
TTCQAAOACG 
AGCTTCAGTT 



TCCCCCACCT 
GAGGGTAGAC 
AAAAAGCGAC 



CCCCCTCCCC CTTCCAAOGA QCTTCOGGAC 



TCCCBTTTGG 
GGAGATGTTG 
CCCGTTGOAA 
SaGGGTTCAC 
T6GGCCTC08 



GACCCOGGAG GCGTGGA6GA 



GGOGAGTGOT TTCGQAAAAA AAAAAAGAAA AAAAGGG 



CCCCTTTTTT TAAACBOSTa ATQAAGACAG AAGGCTCC3GG 
GCAGATGTTT TGGGGGAAOG CCGGGACTGA GAGACTCCAC 
GGCCTTTTTT TCCTCCCTCT TTTCOXTTG CCCCCTCTGC 
AGGGGAGGAG GCCAGCCAGT QTGACCGGCG CTAGGAAATG 
OCGCAGCAGC GGGAGCTAGG GGCGGGGGCQ GAGGAGGACA 
OGTCAAACTG AAATGGATTT GCACGTTGGG GAGCTGGOGG 
CCTTCTTTTC TACGTQAAAT CAGTGAG6TG AOACTTCOCA 
GAGGAGACTG TTTGATGIGG TACAGGGGCA GTCAOTGGAG 



Seq ID NO: 364 DNA sequence
Nucleic Acid Accession #: U10860
Coding sequence: 123-2204



GGOCTCCTTC 
CGTACTGTCG 
GACTCCAAGC 



TCRTAOACCG 
CAGCATTTGC 
CTGTGTATGC 
TTCTTGGAAT 



■GTCA 



TATAAAGGAA 
TGAAGATGCT 
TTGCTATGGT 



TCAGGGGCCT 



TCAGAAGGAA 
ATTCAAGGTT 
GTTATATGGA 
GAAGAATTTC 
ACXTGAGTGT 



TGGTTTTACT 
TGAACCAAGA 
GCCAGTCTGT 
ctcattcttt 



ACAAGTCATT 



ttggggatac ttttgttaao 



GAACTGTTCG 
CAAGGATTCC 

ccxrroGTTTG 

ATGCA6ATGA 
GGAGTTTTCA 
GAAGTTGTTT 
OTGGCAGGTT 
GCACAGTTCC 
CTTTATGATA 
ATTGGAGAQA 
GTAGACTCAA 
GCTGTGCACA 
CTCAAAAAGC 
AGAACftACCC 
TTAAATAT6A 
ATTGCCAAT6 
GGTACTTTAC 



TCAACCTCAG 
COGTCACOGC 
TGQAGAATQC 
TTCTGGATGC 
TGCAGTCTGA 
aTGCTATTAT 
ATCCAGCAAT 
TGAATAAGCT 



TGCTTACACA 
CTGGAAACAT 
ACCCTGAAQT 
TAGCTGGATG 
TCAAAGAQAG 
CAGTTTGTAC 
TTGATAATGG 
TTGGAATTCA 



ATTCACTATT 
ATTTGGAGGT 
GGATAATACA 
TGGAGATAGT 
AGTAGCAGGC 
TGGCCTTACA 
CAGTGGAACC 
AG TAGG CACO 
AGCTTTGCTA 
CTTTATGAGA 
GGTCAAAGTG 
AGATGAAGAT 
TGAAGAGAAA 



1980 
2040 
2100 
2160 
2220 
2280 
2340 

2460 
2520 
2580 
2640 
2700 
2760 



I LAA7GKHLAB KKVKRVTLFO 



CX3ACCCTTCC 60 

GCCCTGGCCC 120 

: CTTAAGGATG 180 

; TAOGGGAAAG 240 

TTGOAAACAC 300 

GGACCTAATT 360 

GGCAAGCCTG 420 

ACCGTGCACA 480 

TaTTCATTAT 540 

GTAGACAAAG 600 

ATAGCAAATG 660 

OAAAATGGAA 720 

TTC31CCGTGC 780 

TCAAAAGTTT 840 

AATCGTGCrr 900 



ATAAATGCra 
AGAACCCXAC 
AGAAAAATCA 



1020 
1080
1140 



I AATTGAAAGT GCATC 




TTGCAAGTOG CAAA6CIGAA CTCATCAAAA CCCATCACAA TGACACAGAG CTCATCAGAA 1320 

ASTTSAGAGIV GOAGGGAAAA GTA&TAGAAC CTCIGAAAGA TTTTCATAAA GATGAAGTGA 1380 

GAATTTTGGG CAGRGAACTT GGACTT(X»G AASA6TTA6T TTCKAGGCAT CCATTTCCAG 1440 

GTCCTOGCCT GGCSUVTCMA QTAATATOTO CTGAAGAACC TTATATTTGT AAGGACTTTC 1500 

CTGAAACCAA CAATATTTTG AAAATAGTAG CTGATTTTTC TGCAAGTGTT AAAAAGCCAC 1560

ATACCCTATT ACAGAGAGTC AAAGCCTGCR CAACAGAAGA GGATCAGGAG AAGCTGATGC 1620 

AAATTACCAG TCTGCATTCA CTGAATGCCT TCTTGCTGCC AATTAAAACT GTAGGTQTGC 1680 

AGGGTGACTG TCGTTCCTAC AGTTACGTGT GTGGAATCTC CAGTAAAGAT GAACCTGACT 1740 

QGGAATCACT TATTTTTCTG GCTAGGCTTA TACXTTOSCAT GTGTCACAAC GTTAACAGAG 1800 

TTGTTTATAT ATTTGGCCCA CCAGTTAAAG AACCTCCTAC AGATGTTACT CCCACTTTCT 1860

TOACAACAGG GGTGCTCAGT ACTTTACOCC AAGCTGATTT TGAGGCCCAT AACATTCTCA 1920 

GG6ASTCTGS OTATGCTGOG AAAATCAOCC AGATGCGGGT GATTTTGACA CCATTACATT 1980 

TT6ATOGGGA CXXaVCTTCAA AAGCABCCTT CaTGOCAGAG ATCTGTGGTT ATTCGAACCT 2040 

TTATTACTAG TGACTTCATG ACTGGTATAC CTGOMCACC TGGCAATGAG ATCCCTGTAG 2100 

AGGTGGTATT AAAGATGOTC ACTGAGATTA AGAAQATTCC TCGTATTTCT OGAATTATGT 2160 
ATGACTTAAC ATC3kAAGCCC CCAGGAACTA CrGAGTGGGA OTAATAAACT TC 



1 11 21 31 41 51

| | | | | |

MALCNGDSKIi EWAGGDUaDG HHHYBGAWI IiDAQAQYGKV IDHRVRBLPV QSEIFPLETP 

AFAIKEQGPR AIIISGGPHS VYAEDAPVfFD PAIFTIGKPV IiGICTGMQMM HKVFGGTVKK 

25 KSVREDGVPN ISVDNTCSLF RGLQKEEWIi LiTHGDSVDKV ADGFKWARS OlIVAaiAHE 

SKKLYGAQFH PEVGLTENGK VILKHFLITOI AGCSGTFTVQ NRELECIREI KERVGTSKVL 

VLLSQGVDST VCTALLNKAL NQBQVIAVHI DNGFMRKRES QSVEEALKKL GIQVKVINAA 

KSFVNGTTTL p'lSDEDRTPR KRISKnaiMT TSPBBKRKII GDTFVKIANE VIGEMNLKPE 

BVFLAQGTIiR POLIESASIiV ASGKAELIKT HRNDTBLIRK LREEQKVIEP LKDFHKDEVR 

30 ILQRELeiiPB ELVSSHPFPG POUVIRVICA BEPYICKDFP ETOKILKIVA DFSASVKKPH 

TIiLQRVKACT TEEDQEKLMQ ITSLBSLNAF UiPHCTVGVQ GDCRSYSWC GISSKDEPDW 

BSI.IFIARLI PBMCHNVHRV VYIFGPPVKB PPTDVTPTFL TTGVI.STIAQ ADFEAHNILR 

BSGYAGKISQ HPVILTPLHF DBDPLQKQPa OQRSWIRTF ITSDPMIGIP ATPGNEIPVE 

VVI>KKVTEIK KIPOI8RIMY DLTSKPPGTT EMB 

Seq ID NO: 366 DNA sequence
Nucleic Acid Accession #: NM_004219
Coding sequence: 46-654

1 11 21 31 41 51

| | | | | |

GOGGCCTCAG ATGAATGOGG CTGTTAAGAC CTGCAATAAT CCAGAATGGC TACTCTGATC 
TATGTTGATA AGGAAAATGG AGAACCAGGC ACCCGTGTGG TTGCTAAGGA TGGGCTGAAG
CTGGGGTCTG GACCTTCAAT CAAAGCCTTA GATGGGAGAT CTCAAGTTTC AACACCACGT
TTTGGCAAAA CGTTCGATGC CCCACCAGCC TTACCTAAAG CTACTAGAAA GGCTTTGGGA 
ACTGTCAACA GAGCTACAGA AAAGTCTGTA AAGACCAAGG GACCCCTCAA ACAAAAACAG 
CCAAGCTTTT CTGCCAAAAA GATGACTGAG AAGACTGTTA AAGCAAAAAG CTCTGT TCCT 
GCCTCAGATG ATGOCTATCC AGAAATAGAA AAATTCTTTC CCTTCAATCC TCTAGACTTT 

50 GAGAGTTTTG ACCIGCCTGA AGAGCACCAG ATTGGGCACC TCCXXTTGAG TGGAGTGCCT 
CTCATGATCC TTGACGAGGA GAGASAGCTT GAAAAGCTOT TTCAGCTGGG CCCCCCTTCA 
CCTGTGAAGA TGCCXTTCTCC ACCATGOQAA TOCAATCTGT TGCAGTCTCC TTCAAOCATT 
CTGTCGACCC TGGATGTTQA ATCGCCACCT OTTXOCrOTG ACATAOATAT TTAAATTTCT 
TAGTGCTTCA GRQPrTOTGT GXATTTGTAT TAATAAAGCA TTCTTCAACA QAAAAAAAAA 

1 11 21 31 41 51

| | | | | |

MATLIYVDKE NGEPGTRVVA KDGLKLGSGP SIKALDGRSQ VSTPRFGKTF DAPPALPKAT
RKALGTVNRA TEKSVKTKGP LKQKQPSFSA KKMTEKTVKA KSSVPASDDA YPEIEKFFPF
NPLDFESFDL PEEHQIAHLP LSGVPLMILD EERELEKLFQ LGPPSPVKMP SPPWESNLLQ
SPSSILSTLD VELPPVCCDI DI

1 11 21 31 41 51

| | | | | |

ATTCGGGGOQ AGGGAGGAGG AAGAAGCGGA GGAGGOGGCT CXZCGCTCGCA GGGCCXSTGCA 

75 CCTGCCXX3CC CGCCOGCTCG CTCGCTOGCX: CGCCGOQCCG CGCTGCOjAC CX3CCAGCATQ 
CTGCCOAGAG TCGGCTGCCC CGCGCTGCCQ CTGCCGCCGC CGCCGCTGCT GCCX3CTGCTG 
CXMCTGCTGC TGCTGCTACT GGGOGCGAGT GGOGGCGGCG GCGGGGCGCG CX3CX3GAGGTG 
areXTCOGCT GCCOGCOCIO CACACCCQAG CX30CTQ6CC6 CCTGC6GGCC CCOSCCGGTT 
GCGCCX3CCO0 OOOaSSTOGC OOCAiSIGaCC GGAGGOGCCC GCATGCCATO GGaOGAGCTC 

80 GTCCGGGAGC CGGGCTOOBG CTGCTGCrOG GICTGCGCCC GGCrGGAOGG CGAGGOGIGC 
QOOGTCTACA CCCOGOGCTG CGGCCAGGGG CTGCGCTGCT ATOCCCACKC GGGCTC03AG 
CTGCCXXTTGC AGGCGCTGGT CATGGGCGAG GGCACTTGTG AGAAGCGCC6 GGACGCCGAG 
TATGGCGCCA GCKOGGAGCA GGTTGCAGAC AATGGCX3ATG ACCACTCAGA AGGAGGCCTG 
GTGGAGAACC AOGTOGACAG CACCATGAAC ATGTTGGGCX3 GGGGAGGCAG TGCTGGCCGG 

85 AAGCCCCTCA AGTOBGGTAT GAAGGAGCTG GCCGTGTTCC GGGAGAAGGT CACTGAGCAG 
CACOGGCAGA TGGGCAAGGG TGGCAAGCAT CACCTTGGCC TGGAGGAGCC CAAGAAGCTG 
CGACXACCCC CTGCCAGGAC TCCCIGCOA. CAGGAACTGG ACCAGGTOCT GGAGCXMATC 
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TCCACX»TGC GCCTTCCGQA TGAGCGGGGC CCTCTGGAGC ACCTCTACTC CCTGCACATC 900 

CCCAACTGTG ACAAGCATGG CCTGTACAAC CTCAAACAGT GC3UXGATGTC TCTGAACGGG 960 

CAGCGTGGGG AGTGCTGGTG TGTGAACCCC AACACCGGGA AGCTGATCCA GGGAGCCCCC 1020 

ACCATCCQGG GGOACCCCQA QTGTCATCTC TTCTACAATQ AGCAGCAGGA GGCTTGCGGG 1080 

5 GTGCACACCX: AGCGGAXGCA CTAOACCOCA OCCAGCCGGT GCCTGQCGCX: CCTGCCCCCC 1140 

GCCCCTCTCC AAACACXX3GC AGAAAACGGA GAGTGCTTGG GTGGTGGGTG CTOtSAQGKtT 1200 

TTCOVGTTCT GACACACGTA TTTATATTTC GAAAOAGACC AGCACCQAGC TOQGCACCTC 1260

CCOGGCCTCT CTCTTCCCAG CTGCAGATGC CRCACCTGCT CCTTCTTGCr TTCCCCGGGG 1320 

, _ GAGGAAGGGG GTTGTGQTCG GGGAGCK^ GTACAGGrTT GGG6AGGGGG AAGAGAAATT 1380 
10 TTTATrTtTG AACCCCIGTG TCCCTTTTGC ATAAOATTAA AGOAAGOAAA AGT 



| | | | | |

MLPRVGCSAL PLPPFPIJJL LPLLLI.LLGA SGGGGGARAE VI>FRCPPCTP ERLAACXSPPP 
VAP^AAVAAV AGOASMPCAE LVREPOCGCC SVCARLBQBA CGVYTPRGGQ GLRCYPHPGS 
EIiPIiQALVHS KQTCEKRRDA BYGASPEQVA DNGDDHSEGG LVENHVDSTM NMLGGC3QSAO 
20 SXFUCSGHKB lAVFREKWTE QHRQHOXGGK HHUSLBBPKK lAPPPARTPC QQEIjDQVUSR 
ISTMitI.PDBR 6PI.BHI.YSUI IPNCDKHQLY MLKQCSCMSLN GQSQBCHCVM PNTOKI.IQGA 
PTIRGDPECH I.FYHEQQEAC QVHTQSKQ 

Seq ID NO: 370 DNA sequence
Nucleic Acid Accession #: NM_004364
Coding sequence: 6-440

1 11 21 31 41 51

| | | | | |

GGAACATGGC GGATCGGCTC ACGCAGCTTC AGGACGCTGT GAATTCGCTT GCAGATCAGT
TTTGTAATGC CATTGGAGTA TTGCAGCAAT GTGGTCCTCC TGCCTCTTTC AATAATATTC 
AGACAGCAAT TAACAAAGAC CAGCXaWJCTA ACCXTACAGA AGAGTATGCX: CAGCTTmS 
CAGCACTGAT TGCAOGAACA GCAAAAGACA TTGATGTTTT GATA<3ATTCC TTACCCAGTG 
^_ AAOAATCIAC AGCTOCtTTA CaWSGCTOCTA QCTRSTATftA GCTAOAAfiAA. GAAAAOCATG 
35 AAGCIGCrAC ATSTGreOAG aATOTTSTTT ATOSAOGACaV CATGCTTCTG GAGAAGATAC 
AAAGOGCACT TGCTOATATT OCACASTCAC AaCTGAAGAC AAQAAOTGQT ACCCATAQCC 
AGTCTCTTOC AGACTCATAG CATCAGTGGA TACCATGTGG CTQAflAAAAG AACTGTTTGA 
GTGCCATITAA GAATTCTGCA TCAQACTTAG A1ACAAGCCT TACCAACAAT TACAGAAACA 
TTAAAC31CTA TGACACATTA CCTTTTTAGC TATTTTTAAT AGTCTTCTAT TTTCACTCTT 
GATAAGCTTA TAAATCATGA TTGAATCAGC TTTAAAGCAT CATACCATCA TTTTTTAACT
OAGTGAAATT ATTAAGGCAT GTAATACATT AATGAAOVTA ATATAAGGAA ACATATGTAA 
AATTCrerrA TGACATAATT TA-rGTCrCCA TTTTQTTGTA TrGGCCAOTA CTTTTACAAT 



| | | | | |

MADRLTQLQD AVNSLADQFC NAIGVLQQCG PPASFNNIQT AINKDQPANP TEEYAQLFAA 60
LIARTAKDID VLIDSLPSEE STAALQAASL YKLEEENHEA ATCVEDVVYR GDMLLEKIQS 120
ALADIAQSQL KTRSGTHSQS LPDS

Seq ID NO: 372 DNA sequence
Nucleic Acid Accession #: AJ271091
Coding sequence: 1-1113

1 11 21 31 41 51

| | | | | |

ATGGAGAATC AGGTGTTGAC GCCGCATGTC TACTGGGCTC AGCGACACCG CGAGCTATAT 60
CTGCGCGTGG AGCTGAGTGA CGTACAGAAC CCTGCCATCA GCATCACTGA AAACGTGCTG 120
CATTTCAAAG CTCAAGGACA TGGTGCCAAA GGAGACAATG TCTATGAATT TCACCTGGAG 180
TTCTTAGACC TTGTGAAACC AGAGCCTGTT TACAAACTGA CCCAGAGGCA GGTAAACATT 240
ACAGTACAGA AGAAAGTGAG TCAGTGGTGG GAGAGACTCA CAAAGCAGGA AAAGCGACCA 300
CTGTTTTTGG CTCCTGACTT TGATCGTTGG CTGGATGAAT CTGATGCGGA AATGGAGCTC 360
AGAGCTAAGG AAGAAGAGCG CCTAAATAAA CTCCGACTGG AAAGCGAAGG CTCTCCTGAA 420
ACTCTTACAA ACTTAAGGAA AGGATACCTG TTTATGTATA ATCTTGTGCA ATTCTTGGGA 480
TTCTCCTGGA TCTTTGTCAA CCTGACTGTG CGATTCTGTA TCTTGGGAAA AGAGTCCTTT 540
TATGACACAT TCCATACTGT GGCTGACATG ATGTATTTCT GCCAGATGCT GGCAGTTGTG 600
GAAACTATCA ATGCAGCAAT TGGAGTCACT ACGTCACCGG TGCTGCCTTC TCTGATCCAG 660
CTTCTTGGAA GAAATTTTAT TTTGTTTATC ATCTTTGGCA CCATGGAAGA AATGCAGAAC 720
AAAGCTGTGG TTTTCTTTGT GTTTTATTTG TGGAGTGCAA TTGAAATTTT CAGGTACTCT 780
TTCTACATGC TGACGTGCAT TGACATGGAT TGGAAGGTGC TCACATGGCT TCGTTACACT 840
CTGTGGATTC CCTTATATCC ACTGGGATGT TTGGCGGAAG CTGTCTCAGT GATTCAGTCC 900
ATTCCAATAT TCAATGAGAC CGGACGATTC AGTTTCACAT TGCCATATCC AGTGAAAATC 960
AAAGTTAGAT TTTCCTTTTT TCTTCAGATT TATCTTATAA TGATATTTTT AGGTTTATAC 1020
ATAAATTTTC GTCACCTTTA TAAACAGCGC AGACTGAAAA TGAGGGCAGG CGCAGTGGCT 1080
CATGCCTGTG ATCCCAGCGC TTTGGGAGGC TGA



| | | | | |

MENQVLTPHV YWAQRHRELY LRVELSDVQN PAISITENVL HFKAQGHGAK GDNVYEFHLE 60
FLDLVKPEPV YKLTQRQVNI TVQKKVSQWW ERLTKQEKRP LFLAPDFDRW LDESDAEMEL 120
RAKEEERLNK LRLESEGSPE TLTNLRKGYL FMYNLVQFLG FSWIFVNLTV RFCILGKESF 180
YDTFHTVADM MYFCQMLAVV ETINAAIGVT TSPVLPSLIQ LLGRNFILFI IFGTMEEMQN 240
KAVVFFVFYL WSAIEIFRYS FYMLTCIDMD WKVLTWLRYT LWIPLYPLGC LAEAVSVIQS 300
IPIFNETGRF SFTLPYPVKI KVRFSFFLQI YLIMIFLGLY INFRHLYKQR RLKMRAGAVA 360
HACDPSALGG

Seq ID NO: 374 DNA sequence 
Nucleic Acid Accession #: NM_016395
Coding sequence: 1-1113

1 11 21 31 41 51

| | | | | |

ATGGAGAATC AGGTGTTGAC GCCGCATGTC TACTGGGCTC AGCGACACCG CGAGCTATAT 60
CTGCGCGTGG AGCTGAGTGA CGTACAGAAC CCTGCCATCA GCATCACTGA AAACGTGCTG 120
CATTTCAAAG CTCAAGGACA TGGTGCCAAA GGAGACAATG TCTATGAATT TCACCTGGAG 180
TTCTTAGACC TTGTGAAACC AGAGCCTGTT TACAAACTGA CCCAGAGGCA GGTAAACATT 240
ACAGTACAGA AGAAAGTGAG TCAGTGGTGG GAGAGACTCA CAAAGCAGGA AAAGCGACCA 300
CTGTTTTTGG CTCCTGACTT TGATCGTTGG CTGGATGAAT CTGATGCGGA AATGGAGCTC 360
AGAGCTAAGG AAGAAGAGCG CCTAAATAAA CTCCGACTGG AAAGCGAAGG CTCTCCTGAA 420
ACTCTTACAA ACTTAAGGAA AGGATACCTG TTTATGTATA ATCTTGTGCA ATTCTTGGGA 480
TTCTCCTGGA TCTTTGTCAA CCTGACTGTG CGATTCTGTA TCTTGGGAAA AGAGTCCTTT 540
TATGACACAT TCCATACTGT GGCTGACATG ATGTATTTCT GCCAGATGCT GGCAGTTGTG 600
GAAACTATCA ATGCAGCAAT TGGAGTCACT ACGTCACCGG TGCTGCCTTC TCTGATCCAG 660
CTTCTTGGAA GAAATTTTAT TTTGTTTATC ATCTTTGGCA CCATGGAAGA AATGCAGAAC 720
AAAGCTGTGG TTTTCTTTGT GTTTTATTTG TGGAGTGCAA TTGAAATTTT CAGGTACTCT 780
TTCTACATGC TGACGTGCAT TGACATGGAT TGGAAGGTGC TCACATGGCT TCGTTACACT 840
CTGTGGATTC CCTTATATCC ACTGGGATGT TTGGCGGAAG CTGTCTCAGT GATTCAGTCC 900
ATTCCAATAT TCAATGAGAC CGGACGATTC AGTTTCACAT TGCCATATCC AGTGAAAATC 960
AAAGTTAGAT TTTCCTTTTT TCTTCAGATT TATCTTATAA TGATATTTTT AGGTTTATAC 1020
ATAAATTTTC GTCACCTTTA TAAACAGCGC AGACTGAAAA TGAGGGCAGG CGCAGTGGCT 1080
CATGCCTGTG ATCCCAGCGC TTTGGGAGGC TGA

Seq ID NO: 375 Protein sequence
Protein Accession #: NP_057479

1 11 21 31 41 51

| | | | | |

HBHOVLTPHV YWAQRHRELY LRVE1.SDVQN PAISITEHVl, KFKAQGHGAK GnNWEFKLE 60 

40 FLDLVKPBPV YKLTQRQVNI TVQKKVSQWW ERLTRgBKRP LPLAPDFDRW LDBSDAEMEL 120 

BAKBEBRliNK LRLESEGSPB TItTNIiRKQYI. PMYNI.VQFLG FSHIFVHLTV RPCILGKESF 180 

YDTFHTVADM MYFCQMLAW ETINAAIGVT TSPVI.PSI1IQ LLGBNFILFI IPGTMEEMQN 240 

KAWFFVFYL WSAIEIPRYS PYMIiTCIDMD WKVLTMLRYT LHIPLYPUK: LVEAVSVIQS 300 

IPIFNETGRF SFTI.PVPVKI IWRFSPPLQI YLIMIFLGIiY INFRHIiYKQR RRKYOKKRKH 360 
STKKKDLDGF LFV 



Seq ID NO: 376 DNA sequence
Nucleic Acid Accession #: NM_005987
Coding sequence: 1-270

1 11 21 31 41 51

| | | | | |

ATGAATTCTC ACCAGCAGAA GCAGCCTTGC ACCCCACCCC CTCAGCCTCA GCAGCAGCAG
GTGAAACAAC CTTGCCAGCC TCCACCCCAG GAACCATGCA TCCCCAAAAC CAAGGAGCCC
TGCCAACCCA AGGTGCCTGA GCCCTGCCAC CCCAAAGTGC CTGAGCCCTG CCAGCCCAAG
ATTCCAGAGC CCTGCCAGCC CAAGGTGCCT GAGCCCTGCC CTTCAACGGT CACTCCAGCA
CCAGCCCAGC AGAAGACCAA GCAGAAGTAA

1 11 21 31 41 51

| | | | | |

MNSQQQKQPC TPPPQPQQQQ VKQPCQPPPQ EPCIPKTKEP CQPKVPEPCH PKVPEPCQPK
IPEPCQPKVP EPCPSTVTPA PAQQKTKQK

Seq ID NO: 378 DNA sequence
Nucleic Acid Accession #: NM_002105
Coding sequence: 74-505

1 11 21 31 41 51

| | | | | |

ACAGCAGTTA CACTGCGGCG GGCGTCTGTT CTAGTGTTTG AGCCGTCGTG CTTCACCGQT 
75 CTACCXCGCT AGCATGTCGG OCCGOGGCAA GACTGGCGGC AAGGCCCGCG CCAAGGCCAA 
GTCGOGCTGG TCGCGCGCCG GCCTCCAGTT CCCAGTGGGC CGTGTACACC GGCTGCTGOG 
GAAGGGCCAC TA06C0GAGC 60GTTGG0GC CGGCGCGCXa GTGTACCTGG CGGCAGTGCT 
OQA6TAOCTC ACCGCTGAQA TCCTGGAGCT GGCGGGCAAT GCGGCCCGOG ACAACAAGAA 
GACQOOAATC ATCCCCOGCC ACCIGCAGCT GGCCATCCGC AACGACQAGG AfiCTCAACAA 

80 QCTOcroaac qgcoioacga tcgcccaogg aggcgtccto cccaacatcc aggccgtgct 

GCIGCCCAAG AAGACCAGCG CCACOGTGGG GCCGAAGGCG CCCTCGGGOG GCAAGAAGGC 
CACCCAGGCC TCCCRGGAGT ACTAAGAGGG CCCGCGCCGC GGCOQGCOGC CCCAGCTCCC 
CATGCCACCA CAAAGGCCCT TTTAAGGGCC ACCACOGCCC TCATGGAAAG A6CTGAQCCG 
_ CTTCAGACTG CGGGGCAAGC GGGCCGCGGC TCCCTTCCCC TCCCCTCCCC TCGCCCQCCT 

85 T0GCCGCXXX3 GCCTCGAGTC CCCGCCCGCC CCCGCTCCCG TCCCGCACCQ CCTGCOOCQT 
CGGCCTCQGQ CCTGCCCTGT CCGCCGTCCG CCCTCCGGTA GGGTTGGGGC CTTCCGOATG 
CGGCTTGGGC GCTCTTCGGG GACCTCOCrTO GCGOGOAAOA CCCQAGCCTQ COGGGGQQAG 
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OBGCXSGCG CCGCACCIGC 
GCTAAGGGGC TGOQGGOAOG 
CAGGGCCGAO GTGGGCAGTC 
CXSGCWSCTGC AGOMGGGTG 
AGACGGCCXSC TGGCCGGGAG 
GCCXXTTCTG CGGCCXX3GAC 
AGGTCTGCGC TGGGGCCGGG 
CTGCCTCCTA GGAGQACATT 
CTATCrrGGAC AGCRAGAGTC 
CCGACGCOGC CCCATTTCCC 
CAGCACAAST CGGTTAATCC 



COGCAGCACC 1 



T CTGG TACCC 
GCTTTGGTGG 
CCAGGCCTTT 
ACGAAGCACT 
TAGGGGAGGG 
GTTTTGCGGA 
TTCCAGCAAA 
CTGTCTGGAC 
ACTAlGAACCT 



TTTATTAAAG GATTGTITTT TTTTT 



CACATCAGCT 
TGGTAACAGG 
CAGAGGCCTG 
AOSCGACTGG 
CTCAACTCGG 
TGAGCCTCCG 
TAQGCATTQG 



CTTGGCCTTC 
CTGAAGGTGA 
GTGCTTAGCC 
ATCGCOGATT 

crccxrrccAT 

CACATCTTCC 
CAGTTTGGCT 
CAGCCAGGCC 
CAATCCAAGC 
TTGGCTTCTG 



:: ATCCOGAGTC 900 






rrAG 



OTGAOGCCCT 
CRGQACTTTC 
TCGGTCTGGC 
CTTCyVTTCAT 
TCCCGAGTGA 
TCACGGCTGG 
TGTCGGGCXX: 
ACCTAGATAC 
AACTGGAATT 
ATGQACTAAT 



1020 
1080 
1140 



I I 
AERVOAGAPV YLJ 
VTIAQGGVIf NIQAVLLPHX 






Seq ID NO: 380 DNA sequence
Nucleic Acid Accession #: AL136942
Coding sequence: 184-864

1 11 21 31 41 51



CAG6CGAGGC GGTCQACGCT 
GCGATGAAGA TGGTOSCGCC 
CATGTCCX3CA CCGGCACCAT 
CTGTTGATTT TATTGAGTGC 
ACTTTGAGTT 
TCCTQATATG 
CATTCTTCTG 
TTATTTATCC 
GAGATGATOT 
GCATTATCTT 
TCAATGGTAG 
•roCTACCCCC 
ACGTGTCTGC 
ATCTQAGCAA 
CTGAAATGCT 
CTTTGCTAGA 



CCTGAAAACT 
CTGGACGCGG 
CCTGCTCGGC 
CCTGGCTGAT 



CTTCTCATGA 
TGGATCATCC 
ATCACTGTGC 
TTTCCCTACA 
CTCTTTATTA 



ACTACGGTGC 
CCGCCACCTT 
CTTTGCAOAC 
TTGTTTGTTG 
TCAACATATG 
TGGGGATATA 
GGACCTAGAA 
TCCCTCTCTT 
CCAOCRTAOA 
ATTAGGTAAR 
TGACTQAAGT 
TAAGACCATT 
GATCTTGTGT 
GTGTTTGGCG 
GGCCCGCTTT 



TGCTATGGCT 
TTACCAGATC 
AAACTCCATT 
CATGTCAGTG 
GACTTTTAAG 
GAACTCCTCT 
eTATQATOAT 



I 

GGTATCGAGQ 
AGGAGCCGGC 
TGCGCGCGCG 
TTCTACTCCA 
GTCTGGTATC 
CCGGATCAGT 
GCCAACATGT 
ACTTACGGAG 
TTTGACTTTQ 
CAGGAATACA 
AATCCTACCT 
GGTTACTTGA 



TAGTTCTGTT 
ACTTTTTAAA 
ACACTGTGAT 
CTAACCTTCC 
GTACCTGCTG 



ATTTCACTTT 
ATTTAGATGT 
AGATTAACTG 
CTAGGCATTG 
OGCCCCAAAG 
CAAAAATAOA 



CTGGCGCCAC 
ACAQCTGCTO 
TGATCaVTCAA 
ATAACTTTTC 
GCATTGCCAT 
CGTACAAGCA 
CCCTGAACAT 
TACGGCAACT 
GTTTGGTCCT 
TTAGCTGTOT 
TTTATGTTAC 
ATGGTGCTGC 
AQCTGAGG6C 
TQCCATGAGC 
TAGATTGAAA 
TAGAATTCTT 



TTGQGCATTT 
CAACTTTTTC 
ATTOTOTAAT 

TAGAAGTCCT TATGTATGTQ TTACRAGAAT TTCCCCCACA 



TCAATGACAQ 
AGAAAGCACC 
CCAGGGACAT 
CTGCATGGGA 
TACTAAGTGT 
AATTGGGATA 



GGGGTOACAT 

TCTGCCCTAG 
TATTTGATAT 
TCCTACTOCT 
AAAAATQAGG 



AGCAGTGACC 
GCCTCGTATG 
CTCTTCTCCT 
ATTGGTTCAA 
ACTTCTGCCT 
TTGAACTTCC 
ATTGCCTTCC 
ATTCATATGT 
TAAGTCGTTT 



ATTTTCTCCA 
ATCTACTGAC 
TGTTAGAGGG 
GGATTCACAT 
GaAGGTCATC 



AAGTTCTGAA 
TGCGATTTCT 
ACGCGCAGCC 
GTTGGTTGCA 
GCCTCCTAAT 
TATTATTCTT 
TT6QAACTGC 
CAGCAATGAC 
CAAGGAGCCA 
AGCAGCTTQA 
CTCTCTGAGC 
ACTGTAGTTT 
CCTGTACQAT 
CAAATCTQAT 
TTCTCTCTGT 
TTCAGCCATT 

ACATCCTTTA 
TGGCCTGAAT 



TGTK 



11 



21 



31 



AAGTATGTCr AGTCACCTTT 
TTQTATGCGC TTTTTACCTT 
TACAAAGTCA QCAACTCTCC 
GCAATTAAAA CAAGGTTTGC 



SI 



1020 

1080

1140
1200
1260
1320

1440
1500
1560
1620
1680
1740

1800

1860 
1920 
1980 



| | | | | |

MKMVAPHTRF YSKSCCLCCH VRTCTILLGV WYLIINAWI. IiIIiSALADP DQYNFSSSBL 
GGDPEFMDDA MMCIAIAISL LMILICAMAT YGAYKORAAW IIPFPCYQIF DFALNMLVAI 
TVLIYPHSIQ EYIRQMPKF PYRDDVMSVK PTCLVI1III.L PISIILTFX6 YLISCVMHCY 
RYIHGRNSSO VliVYVTSNDT TVLLPPYDDA TVNGAAKEPP PPYVSA 



| | | | | |

CaCATGCCAG AAGAACACTQ TTGCTCTTGG TGGAC33GGCC CAGAGGAATT CAGAGTTAAA 
CCTTGAGTGC CTGCGTCCGT GAGAATTCAG CATGGAATGT CTCTACTATT TCCTGGGATT 
TCTGCTCCTG GCTGCAAGAT TGCCACTTGA TGCCGCCAAA COATTTCATG AT6TGCTGG6 
CAATGAAAGA CCTTCTGCTT ACATGAGGGA GCACAATCAA TTAAATOGCT GOTCTTCTGA 
TGAAAATGAC TGGAATGAAA AACTCTACCX: AGTQTGOAAQ OGGGGABACA .TOAGS TGGAA 
AAACTCCroa AAGGGAGGCC GTGTGCAGGC GGTCCTOACC AOIQACTCnC CA6CCCTCBT 
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GGGCTCAAAT ATAACRTTTG OGGTGAAGCT OATATTCCCT AQATGCCAAA AGOAAGATGC 420 

CAATG6CAAC ATAGTCTATG AGAAGAACTG CAGAAATGAG OCTGGTTTAT CTGCTOATCC 480 

AIATOTTTAC AACTGGACAQ CATQQTCASA GGACAOTGAC GGGGAAAATG GCACCGGCCA 540

AAGCCATCAT AACGTCTTCC CTOATGGGAA ACCTTTTCCT CACCACCCCG GATGGAGAAQ 600 

ATQGAATTTC ATCTACGTCT TCCACACACT ■TOGTCAGTAT TTCCAGAAAT TGGGACGATG 660 

TTCAGTGAGA GTTTCTGTGA ACACAGCCAA TGTGACACTT GGGCCTCAAC TCATGGAAGT 720 

GACTGTCTAC AGAAGACATG GACGGGCATA TGTTCCCATC GCACAAGTGA AAGATGTGTA 780 

CX3TGGTAAOV GATCAGATTC CTGTGTTTGT GACTATGTTC CAGAAGAACG ATCQAAATTC 840

ATCCGACX3AA ACCTTCCTCA AAGATCTCCC CATTATGTTT GATGTCCTGA TTCATGATXTC 900 

TAGCCACTTC CTCAATTATT CTACCATTAA CTACAAGTGO AGCTTOGGGG ATAATACTGG 960 

CCTOTTTGTT TCCACCRATC ATACTOTSAA TCACAOSTAT GTGCTCAATQ GAACCTTCAG 1020 

CCTTAACCTC ACIGTaAAAG CTGCAGCACC AGGACCTTGT COGCCACCGC CACCACX^CC 1080 

CAGACCTTCA AAACCCACCC CTTCTTTAGG ACCTGCTCGT GACAACCCCC TGGAGCTGAG 1140 

TAGGATTCCT GATGAAAACT GCCaWSATTAA CAGATATGGC CaCTTTCAAG GCACCATCAC 1200 

AATTGTAGAG GGAATCTTAC3 AGGTTAACAT CATCCAGATG ACAGACX3TCC TGATGCXXX3T 1260 

GCCATGGCCT GAAAGCTCCC TAATAGACTT TGTCGTGACX: TGCCAAGGGA GCATTCCCAC 1320 

GGAGGTCTGT ACCATCATTT CTGACCCCAC CTGCX3AGATC ACCCAGAACA CAGTCTGCAG 1380 

CCCTGTGQAT GTGGATGAGA TGTGTCTGCT QACT6TGAGA CGAACCTTCA ATGGGTCTGG 1440 

GACGTACTOT 6TGAACCTCA CCCTGQGGGA TGAOVCAAGC CTGGCTCTCA OGAGCACCXTT 1500 

CATTTCTGTT CCTGACAGAG ACCXMCCTC GCCTTTAAGG ATGGC3«ACa GTGCCCTGAT 1560 

CrCCaTTGOC TCCTTGQCCA IATTTGTCAC TGTOATCTCC CTCTTGGTOT ACAAAAAACA 1620 

CAAGGAATAC AACCOVATAQ AAAATAGTCC TGQaAATGTa GTCAGAAQCA AAGGCCTGAG 1680 

TGTCTTTCrC AACCCTGCAA AMCCX3TGTT CTTCCOGGGA AACCAGGAAA AGQATCGGCT 1740 

ACTCAAAAAC CAAGAATTTA AAGGAGTTTC TTAAATTTCG ACCTTGTTTC TGAAGCTCAC 1800 

TrrrCAGTGC CATTGATGTG AGATGTGCTG GAGTGGCTAT TAACCTTTTT TTCCTAAAGA 1860 

TTATTOTTAA ATAGATATTG TGGTTTGGGG AAGTTGAATT TTTTATAGGT TAAATGTCAT 1920 

TTTAOAGATa OQaAGaGGGA TTATACTGCA GGCAGCTTCA GCCATGTTGT GAAACTGATA 1980 

AAAGCAACIT AGCAAGGCTT CTTTTCATTA TTTTTTATaX TTCACTTATA AAGTCTTAGG 2040 

TAACTAGTAQ GATAGAAACA CTGTGTCCOG AGAGTAACGA GAGAAGCTAC TATTGATTAQ 2100 

AGCCTAAOCC AGGTTAACI6 CAAOAAGAGG OSGGATACTT TCAGCTTTCC ATGTAACTGT 2160

ATCCATAAAG CCAATGTAGT CCAGTTTCTA AGATCATGTT CCAAGCTAAC TQAATCCCAC 2220 

TTCAATACAC ACTCaTQAAC TCCTGATGGA ACAATAACAG GtXXaAGCCT GTQGTATGAT 2280 

GTGCACACTT GCTAGACTCA QAAAAAATAC TACTCTCATA AATGGGTQG6 AGTATTTTGG 2340 

TGACAACCTA CTTTGCTTGO CTGAGTGAAG GAATGATATT CATATATTCA TTTATTCCAT 2400 

GOACavTTTAa TTAGTGCTTT TTATATACCA GGCATGATGC TGAGTGACAC TCTTGTGTAT 2460 

ATTTCCAAAT TTTTGTATAG TCGCTGCACA TATTTOAAAT CATATATTAA GACTTTCCAA 2520 

AGATGAGGTC CCTGOTTTTT CATGGCAACT TGATCAGTAA G6ATTTCACC TCTOTTrGTA 2580 

ACTAAAACCA TCTACTATAT GTTAGACATO ACATTCTTTT TCTCTCCTTC CTGAAAAATA 2640
AAGTGTGGGA AGAGACAAAA AAAAAAAAA 

Seq ID NO: 383 Protein sequence
Protein Accession #: NP_002501

1 11 21 31 41 51 

| | | | | |

MECLYYFLGF LLLAARLPLD AAKRFHDVLG NERPSAYMRB HNQLNGWSSD ENDWNEKLYP 60 

VWKRCaJMRWK NSWKGGRVQA VLTSDSPALV GSKTITPAVNIj IPPRCQKEDA MGNIVYEKNC 120 

HNEAGLSADP YVYNWTAWSE DSDGEaTOTGQ SHHNVFPDGK PPPHHPGWRR WNPIYVFimi 180 

GQYFQKLGRC SVRVSVNTAN VTLGPQLMEV TVYRRHGRAY VPIAQVKDVY WTDQIPVFV 240 

TMFQKNDiaiS SDETFLKDLP IMPBVLIHDP SHFIJTltSTIH YKWSFGDNTa LFVSmHTVN 300 

HTXVLNGTFS LNLTVKAAAP GPCPPPPPPP RPSKPTPSLG PAGDNPIiELS RIPDENOQIN 360 

HYGHPQATIT IVEOILEVMI IQMTDVUWV PWPBSSLIDF WTC3QGSIPT EVCTIISDPT 420 

CEITQNTVCS PVDVDEMCbli TVRRTFtlSsa TYCVNLTLGD DTSLAIiTSTL ISVP080FAS 480 

PLRMANSALI SVOCLAIPVT VISIibVYKKH KEYNPISHSP GNWRSKOLS VPLHRAXAVP 540 
FPGNQEKDPti LKtlQBPKOVS 

Seq ID NO: 384 DNA sequence
Nucleic Acid Accession #: NM_001134
Coding sequence: 48..1877

1 11 21 31 41 51

| | | | | |

TCCATAITGT GCTTCCACCA CTGCCAATAA CAAAATAACT AGCAACCATG AAGTGGGTGG 60 

AATCAATTTT TTTAATTTTC CTACTAAATT TTACTGAATC CAQAACACTQ CATAGAAATG 120 

AATATGGAAT AGCTTCCATA TTGGATTCTT Aa»ATGTAC TGCAGAGATA ftGTTTAGCTG 180 

ACCTGGCTAC CATATTTTTT GCCCAGTTTG TTCAAGAAGC CACTTACAAG GAAGTAAGCA 240 

AAATGGTGAA AGATGCATTG ACTGCAATTG AGAAACCCAC TGGAGAT6AA CASTCTTCAQ 300 

GGTOrrrAGA AAACXaVGCTA (XTGCCTTTC TGGAAGAACT TTGCCATGAG AAAGAAATTT 360 

TGGASAASTA GGGACATTCA GACXGCTGCA GCCAAAQTQA AQASGGAAGA CATAACTGTT 420 

TTCTTGCACA CAAAAAQCCC ACTCCJWKaVT CX3ATCCCACT TTTOCAAGTT CCRGAACCTG 480 

TCACAAOCTO TGAAGCATAT OAAGAAOACA GGGAQACATT CaVTGAACAAA TTCATTTATQ 540 

AGATAGCAAG AAGGCATCCC TTCCTOTATG CACCTACAAT TCTTCTTTGO QCTGCTCGCT 600 

ATGACAAAAT AATTCCATCT TQCTGCAAAO CTGAAAATGC AGTTGAATGC TTCCAAACAA 660 

AGGCAGCAAC AGTTACAAAA GAATTAAGAG AAAGCAGCTT GTTAAATCAA CATGCATGTG 720 

CAGTAATGAA AAATTTTGGG ACCCX3AACTT TCCAAGCCAT AACTGTTACT AAACTGAQTC 780 

AOAAGTTTAC CAAAGTTAAT TTTACTCAAA TCCAOAAACT AGTCCTGGAT GTGGCCCATG 840 

TACATGAGCA CTGTTGCAGA GOAGATOTGC TGGATTGTCT GCAGGATGGG GAAAAAATCA 900 

TGTCCTACAT ATGTTCTCAA CAAGACACTC TGTCAAACAA AATAACAGAA TGCTGCAAAC 960 

TOACCACGCT GOAAOQTGGT CSkATGTATAA TTCASGCAGA AAATGATGAA AAACCTGAAG 1020 

GTCTATCICC AAATCTAAAC AGQTTTTTAQ QAQATAOAQA TTTTAACCAA TTTTCTTCAG 1080 

GGGAAAAAAA TATCTTCTTG GCAAGTTTTG TTCATGAATA TTCAAGAAGA CATCCTCAGC 1140 

TTGCTGTCTC AGTAATTCTA AGAGTTGCTA AAGGATACCA GGAGTTATTG GAGAAGTGTT 1200 

TCCAGACTGA AAACCCTCTT GAATGCCAAG ATAAAGGAGA AGAASAATTA CAGAAATACA 1260 

TCXS^GGAGAG CCaUVGCATTG GCAAAGCGAA GCTGCGGCCT CTTCCAGAAA CTAGGA6AAT 1320 

ATTACTTACA AAATGCGTTT CTa3TTGCTT ACACAAAGAA AGCCOCCCAQ CTGACCTCGT 1380 

CGGAGCIGAT GGGCATCACX: AGAAAAATGQ CAGCCACAGC AGCCACTTGT TGCCAACTCA 1440 

GTGAGGACAA ACEArTGaCC TQTGGCGAOa OAGCGGCTOA CATTATTATC GGAC3Un:TAT 1500 
GTATCAGACA TOAAATGACT
ATGCCAACAG GAGGCCATGC 
CATTCTCTGA TQACAAGrTC 
TGCAAAGGAT GAAGCAAGAG 
AGGAACAACT TGAGGCTGTC 
6CCAGGAACA GOAAGTCTGC 
CTGCTTTGGG AGTTTAAATT 
TGTGAACTTT TCTCrTTAAT 
AAAGACTTTT ATGTGAGATT 



ATTTTCCRTA 
TTTCTCATTA 
ATTGCAGATT 
TTTGCTGAAG 
ACTTCAGGGG 
TTTAACTGAT 
TCCTTATCAC 



AGGATCIGTG 
ACCTTGTGAA 
TCTCAGGCCT 
AGGGACAAAA 
AAGAGAAGAC 
TTAACACTTT 
AGAAATAAAA 



CCAQTGCTGC 
TGAAAC31TAT 
CCAAGCTCAG 
GCAAAAGOCA 
GTTGGAGAAA 
ACTGATTTCA 
AAAACGAGTC 
TTGTGAATTA 
TATCTCC3\AA 



ACTTCTTCAT 
GTCCCTCCT6 
G6TGTAGCGC 
CAAATAACAG 
TGCTGCCAAG



1560 
1620 
1680 
1740 
1800 
1920
1980 
: FLLNFTESRT LHRNBTOIAS ILDSYOCTAE



PTPASIPLFQ 
SCCKAENAVE 
MFTEIQKI>VIi 
GQCIIHABND 
LRVAKBYQEI. 
FIiVAyTKKAP 
TPVNPGVGQC 
EFLIIiI.VICQK 



VPBPVTSCEA YEEDRETFMN 
CFQTKAATVT KELRESSLLN 
DWAHVHEHCC RGDVLDCLQD 
EKPEGLSFNC NRFLGDSDFN 
LEKCFQTENP LECQDKGBBE 
QLTSSELMAI TRKMAATAAT 
CTSSYAMRRP CFSSIiWDET 
PQITEEQLEA VIADFSGLI.E 



SKTRAALOV 

Seq ID NO: 386 DNA sequence

Nucleic Acid Accession #: NM_003205.1

Coding sequence: 1..3149



41 

I 

ISIiADIiATIF 
EKEILEIOfGH 
KFiysiARRH 
QHACAVMKNP 
GEKIMSYICS 
QFSSGEKNIF 
LQKYIQESQA 
CXni<SEDKI.Ii 
YVPPAFSDOK 
KCCXX3QEQEV 



51 
I 

FAQFVQEATY 
SDCCSQSEEG 
PFI.YAPTILL 
GTRTFQAITV 
QQDTLSNKIT 
LASFVHEYSR 
I.AKRSCGLFQ 
ACGEGAADII 
FIFHXDLCQA 
CFABEQQKbl 



GGATTCTC3W3 
CCCAAGGCTA 
TGGGGTGCCA 
CTGGAGTCCT 
TGGTTCGGGG 



11 

I . 

GGACGCCAGA 
OGCTSSTGCC 
TAGACGCQGA 
TGGAGTTTTA 
ATACCAGCCA 
GCCCCACACA 
CACTGTCCAG 



GOACAQGQTT 
TTAGGTGGAC 
ATTGCAGAAT 
CGCCAGGCCA 
TTCAGTGGTG 
GGCTATGTCA 
CAGATGGCCT 
GATQACTTGC 



CAGGAAGCTA 
CTTATTACCC 
OrrCCATCTA 
ATGACACAGA 
CCATCCTTAA 
CCTACTTTGG 



GCCAGGAGTG 
QTGCACCOCC 
CTCAGAGGGA 
AGCCCATGGC 
OCCACTGAGC 
GGAGTATGCA 
AGGCTTCAGT 
TTTCTGGCAA 
CGAGTACCTG 
TGATGACAGC 



31 
I 

CACGCCGTGC 
CTGCTSSTGC 
GTACTCTCGQ 
ACAGACGGGG 
CTGCAGGGTG 
ATTGAATTTG 
GAGGAGCCTO 
TCCTCCATCT 



41 



51



TGGCTCAGAC 
CTATGCASTQ 
ACXCCTGCTC 



6CCQAGTTCA 
GGCCAGATCC 
ATCSVACCTGG 
TACCTAGGAT 
GCTGGTGTGC 
ATTCGATCCC 



AT6GATCGGA 



GAGGTG6GCA GGGTCTAOGT CTACCTGCAG CACCC3U3CCG GCATASA6CC CACGCCCACC 



CAGGTTCTGC 
CGAGGAGGCC 
GTGGACAAGG 
ATCTTCCCCG 



CIGTTCCTSQ 



CTCTCGCOSA 
CACQGCCTCA 
ATCTTGCTGQ 
GGGGAGCAGA 



ATCCAGTTTG 



GAGGCAGTGC 
GACCTGGGAC 
AGCCAGGGTQ 
GTGACCAOAQ 
GAGTTGGATC 
TCTGCTTCCT 
TGTGAGCTC3G 
TGGGCCAAGA 
TACAAAGCCC 



AGGATGGCIA 
TAGTGTTTQT 
AGCCCCTGTG 
GAGACCTGGA 
CTGTGGTATA 
CCATGTTCAA 
ACCTTAGCTT 
TGGAACTTCA 
CCTCCAGGCA 
GCAQAGAGAT 
TTCACATOGC 
GGCCAGCCCT 
ACTGTGGAGA 
ACCATGTGTA 



ACTTCCSVOAT 
TCTCCGTGQA 
TATTCCC3U3T 
CTGCTGTCCA 
TGCTGGAACT 
TTACQGGACT 
CCX5AGGGTTC 
CGGGACCTCA 
GGCCCCTGCA 
CTTTCTTGCA 
TGAAGATGCC 



CAATGATCrra 
ATTTCCTGGG 
GGCAGCCAGC 
TGGCAATGGA 



gccaiggqgg ctccctttgg 
GGaxaGGAG ggctgggctc 

CACACCCCAG ACTTCTTTGG 
TATCCTGATC tgattgtggg 
CCCATOGTGT CCGCTAGTGC 
CGGAGCTOCA GCTTAGAGGG 
GCTTCTGGAA AACACGTTGC 



QAAGATCTAC 
TCTCAACTTC 
ACATTATCy«3 



CCTGGGTGAC 
CQCCTATGAG 
CAGACACCCA 
OCTGCTG6TG 
TCQGTTTACA 
CCTCAGCAAQ 
GGCTCAGGCC 
AAGCGACTGG 
CCATGTCTAT 
CAGCTGTCCC 
CAACTGCACC 
CCTGCACCAC 
GATCCTGAAA 
CCAACAAGAQ 



ACCCAGACCC 
CTCAGGAACG 
TCCTTGGACC 
AGCAAGAGCC 
TGTGTGCCTG 
AAGAATGCCC 
GCTGAGCTTC 



AGCTGCGCTQ GOGCCCCCXSQ 60 
CGCC93CCACC CAGGGTCGGQ 120 
GGCCCCCXX3G CTCCITCTTC 180 
TCAQTGTGCT GGTGGGA6CA 240 
QTGCTGTCTA CCTCTGTCCT 300 
ACAGCAAAGG CTCTCGGCTC 3 60 
TGGAGTACAA GTCCTTGCAG 420 
TGGCftTGOGC TCXACTGTAC 480 
GCACCTGCTA CCTCTCCACA 540 
CAGATTTCA6 CTGGGCAGCA 600 
CCAAGACTGG C06TGTGGTT 660 
TOTCTGCCAC TCAGQAGCM 720 
TTCAGGGGCA GCTGCaGACT 780 
ACTCTGTGGC TGTTGGTGAA 840 
CCAAAGGGAA CCTCACTTAC 900 
TCTACAACTT CTCAGGGGAA 960 
ACGTCAATGG GOACGGGCTG 1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500
1560 
1620 
1680 
1740 



ATCATCATCC 
TACAAOCrra 
CrCAAGCCTC 



TAGCCATCCT 
GATTCTTCAA 
CAGCCACCTC 



CTACCXSAATC 
ATGGACCAAG 
GTTTGGCCTC 
ACGCTCCCTC 
TGATGCCIGA 



IGTGACCTGG 
GTCCCTCATC 
AATCTCAACA 
CAGGTCACCC 
CATCCCCQAG 
GAGCTCATCA 
CAGGCTCTGG 
ACCAATCACC 
CAGCAASAAC 
TGCCCGGAGG 
AGCCAAAGTC 
CAGCCATTTA 
CT6CCT0GGC 
6CAQAAQGCA 
CTGCTCCTAG 
CCATATGGCA 



TGCTCATCCA 
AGTCAGAATT 
CCCRAGCCCC 
GGATAGAGGA 
ACCIGCAGCr 
TGAACCTCAC 
GGGTCACCGC 
CCAGCCTGAG 
GCAACCOCAT 
TCCGGGACAC 
ACTCGCAAAG 
TGAACGGTGT 
ACXMCCrCA 
ACCAAGGCCC 
AAGGTCAGCA 
CCATTAACCC 
GGGAAGCTCC 
CTGAGTGTTT 
TGCAGTTGCA 
GCCTGCAGTG 
AGCTGCaXA 



TAAGCCTTCC 
CTCTGCCCTT 
GTCCTTTGGT 
CTCCCTCACX: 
GAACCCTGTG 
TGACrCCATT 

GAATGGGGCT 
TCGAGACAAA 
AGTGGACAGC 
CAAGGCTCAG 
GOAAGTGTTT 
TTTCCATGCC 
CCCTCCAGAG 
CTGTGACTAC 
GAAGGCAGGA 



CGACGTGGTT 



GAAGGAGGAG 
CAGCTCCATT 
GCTCCTATAT 
AAAGGGCCTG 
AAGCCGCAGC 



•rrrccGAGTC 

TQAGGCTGTQ 
AAAAGAOCGT 
CCCACTGTGG 
GTCTACTCAT CTACATCCTC 
CCSCCATGGA AAAAGCTCAlG 



1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880
2940 
3000 
3060 
3120 
1 11 21 31 41 51

| | | | | |

MGSRTPESPL HAVQLRWGPR RRPPIiPLLL U,LPPPPRVG GPNLDAEAPA VLSGPPGSPP 60 

QFSVEFYRPG TDGVSVLVGA PKANTSQPGV LQGGAVYljCP WGASPTQCTP lEFDSKGSRIi 120 

LESSLSSSEG EEPVEYKSLQ WFGATVRAHG SSILACRPLY SMRTEKEPLS DPVGTOfLST 180 

DNFTRILBYA PCRSOFSHAA GQGYOQGGFS AEFTKTORW LGGPGSYFWQ GQII.SATQBQ 240

lAESYVPEYL 'IHLVQGQLQT RQASSIYDDS YLGYSVAVGB PSGDDTEDPV AaVPI«3II.Ty 300 

GYVnUIQSD IRSIfYKFSGE QKASYFGYAV AATDVNGDGb DDLLVQAPU. MDRTPDGRPQ 360 

EVGRVYVYLQ HPAGIEPTPT LTLTGHDEFG RFGSaTPLG DLDQDGYIIDV AIGAPFGGET 420 

QQGWFVFPG GPGGLGSKPS QVLQPLMAAS HTPDPFGSAL RGGRDLDtMG YPDLIVGSFG 480 

VDKAWYRGR PIVSASASLT IPPAMFNPEE RSCSLEGNPV ACIMLSFCLM ASGKHVADSI 540

GPTVELQUJW QKQKGGVRRA IiPLASRQATIi TQTI,LIQNGA REDCREMKIY LRNESEFRDK 600 

LSPIHIALNF SIiDPQAPVDS KGLRPALHYQ SKSRIH3KAQ ILIiDCGEDNI CVPDLQLEVP 660

GBQNHVYLQD KHALNIiTFHA QNVGEGGAYB AEIiRVTAPPE AEYSGLVRHP GNFSSLSCOY 720 

FAVIIQSRU.V CDUaiPMKAG ASLWGGLRFT VPHLRDTKKT IQFDFQILSK NUINSQSDW 780 

SFIU.SVEAQA QV7I.HGVSKP EAVLFPVSDH HFRDQFQKEE DLGPAVHHVY ELINQGPSSI 840

SQGVI.BIiSCP QALBGQQLI.y VTRVTGIitCT THHPINPRSL ELOPE6SLHR QQXREAPSRS 900 

SASSGPQII.K CPEAECFRLR CSLGPLHQQE SQSLQLHFRV HAKTFLQREH QPFSI«CEAV 960

YKALKMPYRI LPRQIiPQKBR QVATAVQWTK AEGSYGVPLW IIIIAIIiFGIi U.I.ai.liIYIIi 1020 
YKLGFPKRSI. PYGTAMEKAQ IiKPPATSDA 


Seq ID NO: 388 DNA sequence
Nucleic Acid Accession #: NM_002425
Coding sequence: 26..1453

1 11 21 31 41 51

| | | | | |

AAAGAAGGTA AGGGCAGTGA GAATGATGCA TCTTGCATTC CTTGTGCTGT TGTGTCTGCC 60 

AGTCTGCTCT GCCTATCCTC TGAGTGGOaC AGCAAAAGAQ GAGQACTCCA ACAAGGATCT 120 

TGCCCAGCAA TACCTAGAAA AGTACTACAA CCTCGAAAAQ GATOTGAAAC AGTTTAGAAG 180 

AAAGGACAGT AATCTCATTG TTAAAAAAAT CCAAGGAATG CAGAAGTTCC TTGGGTTGGA 240

GGTGACAGGG AAGCTAQACA CTGACACTCT GGAGGTGATG OGCAAGCCCA GGTGTGGAGT 300 

TCCTGACGTT GGTCACTTCA GCTCCTTTCC TGGCATGOK AAGTGGAGGA AAACCCACCT 360 

TACATACAGG ATTGTGAATT ATACACCAGA TTTGCC3U«3A GATaCTGTTG ATTCTGCCAT 420 

TGAGAAAGCT CTGAAAGTCT GGGAAGAGGT GACTCCACTC ACATTCTCCA GGCTGTATGA 480 

AGGAGAGGCT GATATAATGA TCTCTTTCGC AGTTAAAGAA CATGGAGACT TTTACTCTTT 540

TGATGGCCCA GQACACAQTT TGGCTCATGC CTACCXACCT GGACCTGGGC TTTATGGAOA 600 

TATTCACTTT GATGATGATG AAAAATGGAC AGAAGATGCA TCAGGCACXA ATTTATTCCT 660 

CGTTGCTGCT CATGAACTTG GCX»CTCCET GGGGCTCTTT CACTCAGCCA ACACTGAAGC 720 

TTTGATGTAC CXaCTCTACA ACTCATTCAC AGAGCTCGCC CAGTTCOQCC TTTCGCAAGA 780 

TGATGTGAAT GGCATTCAOT CTCTCTAOGG ACCTCXXXCT GCCTCTACTO AGGAACCCCT 840

GGTGCCCACA AAATCTQTTC CTTCXSGGATC TGAGATOCCA GCCAAGTGTQ ATCCTGCTTT 900 

GTCCTTCGAT GCCATCAGCA CTCTGAGGGG AGAATATCTG TTCTTTAAAG ACAGATATTT 960 

TTGGCGAAGA TCCCACTGGA ACCCTGAACC TGAATTTCRT TTCATTTCTG CATTTTGGCC 1020

CTCTCTTCCA TCATATTTGG ATGCTGCATA TGAAGTTAAC AGCAGGGACA CCGTTTTTAT 1080 

TTTTAAAGGA AATGAGTTCT GGGCCaTCAG AGGAAATGAG GTACAAGCAG GTTATCCAAG 1140

AGGCATCCAT ACCCTGGGTT TTCCTCCAAC CATAAGGAAA ATTGATGCAG CTGTTTCTGA 1200 

CAAGGAAAAO AAGAAAACAT ACTTCTTTGC AGCGGACAAA TACTGGASAT TTGATQAAAA 1260

TAGCCSUSTCX: ATGGAGCSUUJ GCTTCTCTAa ACTAATASCT SATOACTTIC CAOaAGTTGA 1320 

GCCTAAGGTT GATGCTGTAT TACAGGCATT TGGATTTTTC TACTTCTTCA GTG6ATCATC 1380 

ACAGTTTQAO TTTGACCCCA ATGCCAGGAT GGTGACACAC ATATTAAAGA GTAACAGCTG 1440

GTTACATTGC TAGGCXSAGAT AGGGGGAAGA CAGATATGGG TGTTTTTAAT AAATCTAATA 1500

ATTATTCATC TAATGTATTA TGAGCXAAAA TGGTTAATTT TTCCTGC3VTG TTCTGTGACT 1560 

GAAGAAGATG AGCCTTGCaG ATATCTGCAT GTOTCATGAA GAATGTTTCT GGAATTCTTC 1620 

ACTTGCTTTT GAATTGCACT GAACAGAATT AAGAAATACT CATGTGCAAT AGQTGAQAGA 1680 

ATGTATTTTC ATAGATGTGT TATTACTTCC TCAATAAAAA GTTTTATTTT GGGCCTGTTC 1740
CTT 
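
The coordinates stated for Seq ID NO: 388 can be sanity-checked against its first printed row. The sketch below assumes the "Coding sequence" range is 1-based and inclusive and that it includes the stop codon; both are assumptions about the listing convention rather than statements from the source, and the helper name `region` is illustrative only.

```python
# First printed row of Seq ID NO: 388 (positions 1-60), copied from the listing.
FIRST_ROW = ("AAAGAAGGTA AGGGCAGTGA GAATGATGCA "
             "TCTTGCATTC CTTGTGCTGT TGTGTCTGCC")

# Coding range as stated in the header: 26..1453 (assumed 1-based, inclusive).
CDS_START, CDS_END = 26, 1453


def region(seq: str, start: int, end: int) -> str:
    """Return seq[start..end] using 1-based inclusive coordinates."""
    return seq[start - 1:end]


sequence = FIRST_ROW.replace(" ", "")

# The first three bases of the stated range should form a start codon.
start_codon = region(sequence, CDS_START, CDS_START + 2)

# Length of the stated range and the number of whole codons it holds.
cds_length = CDS_END - CDS_START + 1
codon_count, remainder = divmod(cds_length, 3)

print(start_codon, cds_length, codon_count, remainder)
```

A remainder of zero means the stated range divides evenly into codons; if the final codon is a stop codon (an assumption here), the encoded protein would have one residue fewer than the codon count.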



1 11 21 31 41 51

| | | | | |

MHLAFIiVUiC LPVCSAYPIiS GAAKEEDSNK DLAQCiXIiEXY VNLEIOJVKQF RRKDSNI<IVK 
KIQGMQKPLG LEVTGKU3TD TLEVMRKPRC GVPDVGHPSS FPGMPKHRKT HIiTYRIVNYT 

PDIiPRDAVDS AIEKALKVWE EVTPI.TPSRL YEGEADIMIS FAVKEHGDFY SFDGPGHSLA
HAYPPOPGLY GDIHFDDDEK WTEDASGTHL FLVAAHELGE SLGIiFHSAUT BALMYPLYNS 
FTELAQFRIiS QODVNGIQSL YGPFPASTEE PLVPTKSVPS OSEMFAKCDP AI.SFDAISTL 
RGEYIiFPKDR XFWRRSHWNF EPBFHI.ISAF MPSLPSYIiDA AYEVMSROTV FIFKGHEFHA 
IROJEVQAGY PRGIHTLGPP PTIRKIDAAV SDKEKKKTYF FAAOKYHRFD ENSQSMEQGF 

PRLIADDFPG VEPKVDAVLQ AFGPPYPFSG SSQFEFDPHA RMVTHIIiKSN SWLHC

Seq ID NO: 390 DNA sequence

Nucleic Acid Accession #: NM_002421.2

Coding sequence: 1..1409

1 11 21 31 41 51

| | | | | |

ATGCACAGCT TTCCTCCACT GCTGCTGCTG CTGTTCTGGQ QTGTGQTGTC ACACAOCTTC 
CCAGCGACTC TAGAAACACA AGAGCAAGAT GTGGACTTAG TCCAGAAATA CCTGGAAAAA 
TACTACAACC TGAAGAATGA TGGGAGGCAA GTTGAAAAGC GGAGAAATAG TGGCCCAQTG
CfTTGAAAAAT TGAAGCAAAT GCAGGAATTC TTTGGGCTGA AAGTGACTGG GAAAGCAGAT 
GCTGAAACCC TGAAGGTGAT GAAGCAGCCC AGATGTGGAG TGCCT6ATt3T GGCTCA6TTT 
GTCCTCACTG AGGGGAACCC



TGC3AGTAATG TCACACCTCT 
ATATCTTTTG TCAGGGGAGA 
CTTGCTCATG CTTTTCAACC 
GAAAGGTGGA CCAACAATTT 
GGCCATTCTC TTGGACTCTC 
ACCTTCAGTG GTQATGTTCA 
GGACGTTCCC AAAATCCTGT 
AAQCTAACCT TTGJVTGCTAT 
TTCTACATGC 
TGGCCACAAC 



CCCAAGGACA 
CTTTCTGAGG 
GATOAATATA 
GGAATTGGCC 
GGAACAAGAC 
AATAGCTGGT 



TGCCAAATG6 
AAGGGAATAA 
TCTAC3VGCTC 
AAAACACTGG 
AACGATCTAT 
ACAAAGTTGA 
AATACAAATT 
TCAACTGCAG 



TCGCTGQGAG CAAACACATC TGACXH-ACAG G 
AGCABATGTG OACCATGCCA TTGAGAAAGC C 
GACATTCACC AaOQTCTCTG 
TC31TCGGGAC AACTCTCCTT 
AGGCCCAGGT ATTGGAGGGG ATGCTCATTT 
CAGAGAGTAC AACTTACATC GTGTTGCGGC 
CCATTCTACT GATATCGGGG CTTTGATGTA 
GCTAGCTCAG GATGACATTG ATGGCATCCR 
CCAGCCCATC GGCCKACAAA CCCCAAAAGC 
AACTACGATT OQGGGAGAAQ TGATGTTCTT 
CrrCTACCCG GAAGTTGAGC TCAATTTCAT 
GCTTGAAGCT GCTTAC6AAT TTGCOSACAQ 
GTACTGGGCT GTTCAGGGAC AGAATGTGCT 
CTTTGGCTTC CCTAGAACTG TGAAGCSkTAT 
AAAAACCTAC TTCTTTGTTG CTAACAAATA 
GGATCCAGGT TATCCCAAAA TQATAGCACA 
TGCAGTTTTC ATGAAAGATG GATTTTTCTA 
TGATCCTAAA 
GAAAAATTAG 






TGOAGOAAAT 540 

TGAT6AAGAT 600 

TCATGAACTC 660 

CCCTAGCTAC 720 

AGCCATATAT 780 

AT6TGACAGT 840 

TAAAGACAGA 900 



1020 
1080 
1140 
1200 
1260 
1320 
1380



AOATOAAQTC 
ACACG6ATAC
CGATGCTGCT 
CTGGAGGTAT 
TGACTTTCCT 
TTTCTTTCAT 
CCAGAAAGCT 



MHSFPPLIiLIj L 
VEKLKQMQEF f 
YTPDIiPRAOV E 
LAHAFQK3PO 
TFSCa}VQLAa 



ODIOGIQAIY 
EVELNFISVF 
PRTVKHIDAA 
MKDGPFYFPH 



f PATLETQEQD 
) AETUCVHKQP 
U HSNVTPLTFT 

GRSQNFVQPI 
HPQItPNGIiBA 
LSEENTGKTY 
GTRQYKFDPK 



31 
I 

VDLVQKYLEK 
RCGVPDVAQF 
KVSBGQADIM 
NLRRVAAHEL 
GPQTPKACDS 
AYBPADRDEV 
FFVANKYWRY 
TKRIIiTLOKA 



41 



51 



YYNLKNDGRQ V 
VLTEGNPRWE QTHLTYRIEN 
ISFVRGDHRD NSPFDGPGGN 
GHSLGLSHST DIGALMYPSV 
KIiTFDAITTI R6EVMFFKDR 
RFFXGNKVHA VQGQNVLKGY 
DEYKRSMDPG YPXMIAHDFP 
NSHENCSUQI 



Seq ID NO: 392 DNA sequence
Nucleic Acid Accession #: NM_002421.2
Coding sequence: 1..1409



ATQCACAGCT 
CCAGCGACTC 
TACTACAACC 
GTTGAAAAAT 



TTCCTCCACT GCTGCTGCTG 



GTCCTCACTG 
TACACGCCAG 
TGGAGTAATG 
ATATCTTTTG 
CTTGCTCATG 



TGAAGAATQA 
TGAAGCAAAT 
TGAAGGTGAT 
AGGGGAACCC 
ATTTGCCAAG 
TCACACCTCT 
TCAGGGGAGA 
CTTTTCAACC 
CCAACAATTT 



CTGTTCTGGG 
GTGGACTTAG 
6TTGAAAA6C 



I 



QAAGCAGCCC 
TCGCTGGQAG 
AGCAGATGTG 
GACATTCACC 
TCATCGGGAC 



CAGAGAGTAC 



AGATGTGGA8 
CAAACACATC 
GACCATOCCA 
AAGGTCTCTG 
AACTCTCCTT 
ATTGGAGGGG 
AACTTACATC 



TCCAGAAATA 
OSA6AAATAG 
AASTGACTGO 
TGCCTGATGT 
TGACCTACAG 
TTGAGAAAGC 
AGGGTCAAGC 



51 
I 

ACACAGCTTC 
CCTGOAAAAA 
TGGCCCAGTG 
GAAACCAGAT 
GGCTCAGTTT 
GATTGAAAAT 
CTTCCAACTC 
AGACATCATG 
TGGAGGAAAT 
TGATGAAGAT 
TCATGCCCTC 



GGCCATTCTC TTQGACTCTC CCATTCTACT GATATCGGGG CTTTGATGTA CCCTAGCTAC 720 



AAGCTAACCT 
TTCTACATGC 
TGGCCACAAC 
CGGTTTTTCA 

CmCTGAGQ 
GATGAATATA 
GGAATTGGCC 
OGAACAAQAC 
AATAGCTGGT 



GTQATGTTCA 
AAAATCCTGT 
TTGATGCTAT 
GCACAAATCC 
TGCCAAATGG 
AAGGGAATAA 
TCTACAGCTC 
AAAACACTGG 
AACGATCTAT 
ACAAAGTTGA 
AATACAAATT 
TCAACTGCAG 



GCTAGCTCAG GATGACATTG ATtSGCATCCA 
CCCCAAAAGC 
TGATGTTCTT 
TCAATTTCAT 
TTGCOGACAG 
AGAATGTGCT 
TGAAGCATAT 
CTAACAAATA 
TQATAGCACA 



AACTACGATT 
CTTCTACCCG 
GCTTGAAGCT 
GTACTGGGCT 
CTTTGGCTTC 
AAAAACCTAC 
GQATCCAGGT 
TQCAGTTTTC
TGATCCTAAA



AGOCATATAT 
ATGTGACAGT 
TAAAGACAGA 



GCTTAOGAAT 
GTTCAGGGAC 
CCTAGAACTG 



AGATGAAGTC 
ACACGGATAC 
CGATGCTGCT 
CTGGAGGTAT 
TGACTTTCCT 



TTTTQACTCT CCAGAAAGCT



1200 
1260 
1320 
1380 



1 11 
I I 
MHSFPPLLLL LFWGWSHSF 
VBKI.KQKQEF FGLKVTGKPD 
YTPDLPRADV DHAIEKAFQL 
LAHAFQPGPO IGGDAEFDED 
TPSGDVQLAQ DDIDGIQAIY 
PYMRTKPPYP EVELNFISVF 
PKCIYSSFGF PRTVKHIDAA 
GIGKKVDAVF MKDGFFYPPH 



21 



31 



I I 
PATLETQBQD TOLVOKYLEK 
AETLKVMKQP RCGVPDVAQF 
WSNVTPI.TFT KVSEGQADIM 
ERHTNNFRKY NIJIRVAAKAIi 
GRSQNPVOPI GPQTPKACDS 
HPQLPMGIjEA AYBFADROEV 
LSEraiTGtCTY FFVAMKyWRY 
GTRQYKFOFK TKRILiTi:.QKA 



41 



51 



I 

YYNIiKNDGRQ VEKRRNSGPV 
VLTEGMPRWE QTHLTYRIEN 
ISFVR(33HRD KSPFDGPGGN 
QISLGLSHST DIQALMYPSY 
KLTFDAITTI RGEVMFFKDH 
RFFKGNKYKA VQGQNVIAQY 
ORYKRSriDPG YPKMIAHDFP 



Seq ID NO: 394 DNA sequence
Nucleic Acid Accession #: NM
Coding sequence: 1..1506










MGGTCAGAA AGCCIGTrGT GTCCACCATC TCCAAAGGAG 
AAOQ6QAG6C TGCCTTCCCT GGGCAACAAG GAGCCACCtG 
AAGAGGAAAQ TCACTTTACT aAGGOQAQTC TCCATTATCA 
GGAATCTTCA TCTCTCCTAA GGGCXSTGCTC CAGAACACGQ 
ACCaTCTGGA CGGTGTGTGG GGTCCTGTCA CTATTTGGRG 
GGAACAACTA TAAAGAAATC TGGAGGTCAT TACACATATA 
TTACCAGCTT TTGTACGAC3T CTGGGTCGAA CTCCTCATAA 
GTGATATCCC TGGCATTTGG ACGCTACATT CTGGAACXSVT 
CCTGAACTTG CGATCAAGCT CATTACAGCT GTGGGCATAA 
AGCATGAOTQ TCAGCTGGAG OGCCCGGATC CAGATTTTCT 
OCAATTCTOA TAATTATAGT OCCTGaAOTT ATGCAQCTAA 
TTTAAAGACG OSTTTTCftGO AASAGATTCA AGTATTAaGC 
TATGGAATGT ATGCATATGC TGGCTGGTTT TACCTCAACT 
AACCCTQAAA AAACCATTCC CCTTGCAATA TGTATATCCA 
TATGTGCTGA CAAATGTGGC CTACTTTACQ ACCATTAATG 
AATGCAGTGG CAGTGACCTT TTCTGAGCGG CTACTGGGAA 
ATCTTTGTTG CCCTCTCCTG CTTTGGCTCC ATGAACGGTG 
TTATTCTATO TTG0GTCTCX3 AGAGGGTCAC CTTCCAGAAA 
OGCAAGCACA CTCCTCTACC AQCTGTTATT GTTTTGCACC 
TTCTCIGGAG ACCTOOACRG TCTTTTGAAT TTCCTCAGTT 
GGGdGGCAG TTGCTGGGCT GATTTATCTT CXUVTACAAAT 
TTCAAGGTGC CACTGTTCAT CCCAGCTTTG TTTTCCTTCA 
CTTTCCCTCT ATTCGGACCC ATTTAGTACA GGGATTGGCT 
GTCCCTGCGT ATTATCTCTT TATTATATGG GACAAGAAAC 
TCAGAGAAAA TAACCAGAAC ATTACAAATA ATACTGGAAG 
TTATGAACTA ATGGACTTGA GATCTTGGCA ATCTGCCCAA 
TTTTTACTTC ATTTTCTaAA AGTCTAGAGA ATTACAACTT 
CAOTTATTTT TATICaTATA TTTTAGCATA TTOJAACTAA 
AACTCTATQT AGTTATAGAA AGTGAATATQ CAGTTATTCT 
GTCTCTGATA CCTACXTTATT GOOGrrAQGA GAAAAOACTA 
TCTCTACAAC ATATGTTAGC AOGGCAAAGA ACCTTCAAAT 
TATATATGGG TTTTGTAAAG ATGGTTTXAC ACACTAC3«5A 
TTTTCAATTC TGAAAAAAAG CATACATCAT QATTATGGCA 
ATTTTACATT GACATTGCAT TGCTTCCCCT TAGATACCAA 
GCTTTAATGG ATTATACCCA GAGCACTTTG AACAAAGGTC 
TTAAAGAAGA GTTTCTAGGG GCTACTGTTT ATGAGACACA 
AAAAATCCTT GAGRATTTAT TATGTCAGAT GTTTTTTCAT 
TTATCTGTCA TTTTTTTTTT TCACATCAGT TTGATCAGGA 
AQCAASAGTT AGTTTGGTAT TAAATCXTTCA TTAGAACSiAC 
TACCCCTGAT GAGTCTATCT AAACATAIGC ATTTTAAGCC 
TGAGAGAAAT AACCAACAAA GAAGATGTTC AAAATAATAO 
CTACATGCAA TGTTAGTAAT TCTGAAGTTT TTTA AATTT A 
TQAATTTTGA CAGTTTGTGC ATTTTCTTTA TACATTTTAT 
CTTCAGATGA AACTGTCCAG ATTAATTAGQ AAAAOGCATA 
AAGAAATGTC GCTGTAAATA AGATTTACAA CrGATGTTTC 
ATCTAGGCTT TGTCAGTAAT TTCCACACCT TAATTATCAT 
CTGATAAGAA GAAAATTGAA ATGAGAATCT GTGGATAAGT 
GTTTTGCCAG TATTAGAAAA TACTGTGAGC CGGGCATGGT 
GCACTTTGGG AGGCTGAGGQ GGTGGATCAC CTGAGGTCGG 
CAACATGGAG AAACCCCATC TCTACTAAAA ATAC3VAAATT 
GdGGTAATC TCAGCTATIG AGOAGGCTGA GGCAGGAGAA 
GA6GTTGCAO TOAGOCAAGA TTGCACCACT GTACTCCAOC 
CCATCTCCAA AAAAAAAAAA AAAA 






CTTTGTCTTA 
TTTTGGAAGT 
TACGCCCTGC 
TTTTTATTCA 
CTGTAGTGAT 



GTTACCTGCA GGGAAATGTT 
GGCAGGAGAA AGTGCAGCTG 
TTGGCACCAT CATTGGAGCA 
CATGTCTCTQ 
TCCTQAATTG 
CTTTGGTCCA 
AGCTACTGCr 
ATGTGAAATC 
GGTCCTAAAT 
CAAGCTCACA 
AACGCAGAAC 
GGCTTTTTAT 
AGAAGTAGAA 
CACCATTGGC 
GCTGCTTTCA 
AGCAGTTCtXJ 
TGTCTCCAGG 
GATTCATGTC 
GATAATGCTC 



TTAAAGGTCA 



TTGTTACTGA 
TGGCCATTGT 
CTGAGGAGCT 
ATTTCTCATT 
GTGTGTTTGC 
TCXrrCTCCAT 
CTTTGACAAT 



CATQCCTCTT 
TCGTCATCAC 
CCAGGTGGTT 
TTGTACCAGA 
GGGGAGACAC 
TGGTGATAAA 
TTrCTAAGAA 
ATGAGICGCa. 
GACAATTACT 
TGAAGACTGA 
TGTCTATACT 
AAGAGGAGAG 
TTTAGATAAC 
AGTGGGGATT 
TCCAGGAGTT 
TCATTATCAG 
AAGTGTATAA 
CACCTOTTTC 
TTCAAATTAC 
TCCCRT ATCT 
TQGCTATTTT 



GCATCGTCCT 
CIATQOTTOCC ' 
TCIGACTGGA 
TAGAATAATG 



TCAACTTGCA 
QTTTGTGTTC 
GGCTTACATC 
GAGTTCTAGA 
AGCTGGGCAT 
TTGCTTGAAC 
CTQQOTGACA 



AAAATAGGGA 
CAAAAGGAGT 
ATTTAGTTAT 
C3VATTCTTGA 
ATGTGGTCAT 
GATTTTTCTG 
GTGAAAAGTG 
AAAOAAATTC 
AAACACTCAT 
GTTGAATACA 
ATGTTTAAGT 
GAAGTTTTAG 
CACATCTTAG 
ACTAATAACT 
ATTATCAACA 
GTAATCATAT 
TACACGATGA 
XAAAATATCT 
AAAATTGCAA 
CCACTTCTAT 
AAAGAGACAA 
AGAAGATGTT 
TCTAATCCCA 
CC3VGCCTGAC 
GGTGGCACAT 
CCGGGAGGGQ 
AAGTCAGACT 



21 



31 



51 



MVRKPWSTI 
OIFISPKSVIi 
LPAFVRVHVE 
SMSVSWSARI 
YGMYAYAGWP 
NAVAVTFSEK 
RKHTPLPAVI 
FKVPIjFIPAL 
SEKITRTLQI 



Seq ID NO: 396 DNA sequence
Nucleic Acid Accession #: NM
Coding sequence: 57..764



GCCGCCAGOO GCTTTCTOGG t 



RCCCCGCTCG 
GOQATGCTGC 
ACGGRCCCTQ 
GCCAGTTCCT 



TGGACGACCA 
GTGAAAAATT 
AAGCTACTTG 
AAGATGAGGG 
CCTGTGATGC 



COGGGCOCTA CTTCTCOSTT ACTACTACQA 
6T3KCGGGGGC TGCGAGGQCA ACGCCAACAA 
TTGCTGOAGG ATAGAAAAAQ TTCCCAAAGT 
GTSTGAGGGG TCCftCAGAAA AGTATTTCTT 
CTTTTCXX5GT GGGTGTCACC GGAACOSGAT 
TATGGaCTTC TGCX3CACCAA AGAAAATTCC 
ACTOTGCTCT GCCRATGTGA CTCGCTATTA 
TTTCACCTAT ACTGGCTGTG GAGGGAATGA 



: CAGOGGGCOG COCGACCCCX: 
C TGCIGCTTTT CCTGACGGAG 
CTGTCTCCTG 
CAGQTACACG 
TTTCTACACC 
TTGCCGGCTG 
TAATCTAAQT 
TGAGAACAGG 
ATCATTTTGC 
TTTTAATCCA 
CAATAACTTT 



1080 
1140 
1200 



iseo 

ISBO 
1740 



1920 
1980 
2040 
2100 
2160 



2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 



I I I I I 

smcrajawv hgrlpslghk bppqqekvqii kricvtluigv siiigtiiga 

QMTGSVaMSIi T1WTVCS7LS IiFBAIiSYABI. GTTIKXSQGR VTyiLEVPaP 
lAIIRPAATA VISLAFORYT LBPFPIQCBI PBLAIKLITA VSITWMWLM 
QIFLTPCKLT AILIIIVFOV KQIiIKaQTON FXDAF80SDS SITRIJPLAFY 
YIMFVTBEVE NPEKTIPLAI CISMAITIGV YVLTNVAYFT TIHAEEUiLS 
UiGNFSIAVP IFVALSCFGS KNGOVFAVSK LFYVASREGK U?BII.SMIBV 
VI.HPLTT4IMI. FSGDLDSLUf FLSPARHLFI GLAVAGtilYL RYXCPDMHRP 
FSFTCajFMVA LSLYSDPFST GlGFVITl.Ta VPAYYIiFZIH DKKPHWFRIM 
ILEWPEEDK L 



51 

I 

TGCACCATGG 
6CTGCACTGG 
CCCCTAGACT 
CAOAGCTGCC 
TGGGAGGCTT 
CAAGTGAGTG 
TCCATGACAT 
TTTCCAGATG 
TACAGTCCAA 
AOATACAGAA 
GTTAGCAGGG 



AGGATTGCAA ACGTGCATGT GCAAAAGCTT TGAAAAAGAA AAAGAAGATQ CCAAAGCTTC 720

GCTTTGCCAG TAGARTCXXSG AAAATTCGGA AGAAGCAATT TTAAACATTC TTAATATGTC 780 

ATCTTGTTTG TCTTTATGGC TTATTTGCCT TTATGGTTGT ATCTGAR.GAA TAATATGACA 840 

GCATGAGGAA ACAAATCATT GGTGATTTAT TCACCAGTTT TTATTAATAC AAGTCACTTT 900 

TTCAAAAATT TGGATTTTTT TATATATAAC TAGCTGCTAT TCAAATGTGA GTCTACCATT 960 

TTTAATTTAT GOTTCAACTQ TTTOTQAGAC GAATTC3TOC AATGCATAAG ATATAAAAGC 1020 

AAATATOACT CACTCATTTC TTGGGGTCGT ATTCCTGATT TCAQAAGAGa ATCMAACTQ 1080 

AAACAACATA AGACAATATA ATCATGTGCT TTTAACATAT TTGAOAATAA AAAOGACTAO 1140 
CC 



1 11 21 31 41 51 

| | | | | |

MDPARPLGLS ILLLFLTEAA LGDAAQEPT6 NNAEICLLPL DYGPCRALUi RYVYDRYTQS 60 

CRQFIiYGQCE GMAiJHPVTWE ACDDACWHIB KVPKVCSLgv SVI»QCBSST EXXFEMIiSSM 120 

TCEKFFSGGC HBNHIENRPP DEATCMGPCA PKKIPSFCYS PXDEOLCSAN VTRYYFNPRY 180 
RTCDAFTYTG CGOniNNFVS REDCKRACAK ALKKKKKMPK tiRFASRIRKI RKKQF 

Seq ID NO: 398 DNA sequence

Nucleic Acid Accession #: NM_001508.1

Coding sequence: 1..1361

1 11 21 31 41 51

| | | | | |

ATGGCTTCAC CCAGCCTCCC GGGCAGTGAC TGCTCCCAAA TCATTOATCA CAGTCATOTC 60 

CCCX3AGTTTG AGGTGGCCAC CTGGATCAAA ATCACCCTTA 'TTCTSGTGTA CCTGATCATC 120 

TTCGTGATGG GCCTTCTGGG GAACAGCQTC ACCATTCX3GG TCACCCAGGT GCTGCAGAAG 180 

AAAGGATACT TGCAGAAGGA GGTGACAGAC CACATGGTCA GTTTGGCTTa CTCGGACATC 240 

TTGQTGTTCC TCATCGGCAT GCCCATGGAG TTCTACAGCA TCATCTGGAA TCCCCTGACC 300 

ACGTCCAGCT ACACCCTGTC CTGCAAGCTa CACACTTTCC TCTTOGAGGG CTGCAGCTAC 360 

OCTACQCTQC lOCAOaTQCT GAOSCTCAiGC TTTGAQOaCT ACATCQGCAX CTGT CACCCC 420 

TTCAGGTACA ACOCTSTOTC GQGRCCTTGC CAGCTGAAGC TGCTSATTGG CTTOGTCTGQ 480 

GTCACCTCCX; COCTGGTGGC ACTGCCCTCG CTGmGCCA TGGGTACTOA OTACXrCCCTG 540 

GTGAAC3GTGC OCAGCCACCG GGGTCTCACT TGCAACOGCT CCAGCavCCOQ CCACCaCQAQ 600 

CAGCCCQAQA CCTCCAATAT GTCCATCTGT ACCAA(XTCT CCAGCXS3CTQ GACCX3TGTTC 660 

CaOTCCAGCA TCTTCGGCGC CTTCGTCGTC TACCTCGTGG TCCTGCTCTC CGTAGCCTrC 720 

ATGTGCTGGA ACATGATGCA GGTGCTCATG AAAAGCC3W3A AGGGCTCGCT GGCCGGGGGC 780 

ACGOGGCCTC CX3CAGCTGAG GAAGTCOGAG AGCXiAAGAGA GCAGOACCGC CAGGAGGCAG 840 

ACCATCATCT TCCTGAGGCT GATTGTTGTG ACATTGGCCQ TATGCTGGAT GCCCAACCAG 900 

ATTCGOAGGA TCATGaCTGC GGCCAAACCC AAGCACGACT OGACSAGGTC CTACTTCXXSQ 960 

GOGTAtaVTOA TCCtCCTCCC CTTCTCGGAG ACGTTTTTCT ACCTCABCIC GCICATCAAC 1020 

CCGCTCCTGT ACAC3GTGTC CTCGCAfiCAG TTTOGGOSGQ TGTTCGTGCA GGTGCTGTGC 1080 

TGCCGCCTGT CGCTGCAGCA 0GCCAA(3CAC GAGAAGCGCC TGCGCGTACA TGCGCACTCX: 1140 

ACCACOGACA GOQCOaSCTT TGTGCAGC6C COGTTCCTCT TOGCGTCCCG GC0CX3W3TCC 1200 
TCTGCRAGGA GAACTGAGAA QATTTTCTTA AGCACTrTTC AOAGOGAGGC OGAGCCCCAO 



| | | | | |

MASPSLPGSD CSQIIDHSaV PBFEVATWIK ITIiILVYLII FVM6LU3HSV TIRVTQVLQK 
KGYLQKEVTD HMVSIACSDI LVFLIOflPME FYSIIIOIPIiT TSSYTIiSCKIi HTFIiFEACSY 
ATLLHVLTLS FERYIAICHP FRYKAVSGPC QVKLLIGFVH VTSALVALPIi LFAMOTEYPIi 
VNyPSHRGLT CNRSSTRHHB QPETSNMSIC TNLSSHMTVF QSSIFGAFW YLWUiSVAF 
MCWMMMQVUI KSQKGSLAGG TRFPQLRKSB SEBSSTARBQ TIIPLRIiIW TIAVCWMPHQ 
ft AYMII.I.PFSB TPFYIiSSVIN PLLYTYSSOQ FRRVFVQVLC 



8K3QSLSLES LEPNSQAKPA NSAAENQFQB HEV 

Seq ID NO: 400 DNA sequence
Nucleic Acid Accession #: NM_006475.1
Coding sequence: 28..2538



TTOCTGCTTA TTOTTAACCC TATAAACX3CC AACAATCATT ATGACAAGAT CTTGGCTCAT 
AGTCGTATCA GGGGTCX3GGA CCAAGGCCCA AATGTCTGTO CCCTTCAACA GATrrTOGGC 
ACCAAAAAOA AATACTTCAQ CRCTTGXAAO AACTGGTATA AAAAGTCCWT CTGTGGACAQ 
AAAA06ACTO TTTTATATQA ATQTTQCOCT QSTtKCKIGA GAAXGSftAGG AAXGhAASGC 
T6CCCAGCAG TTTTGCCCAT TCAOCATGTT TATGGCACTC TGGGCATCOT OSGAGCCACC 
ACAAOGCAGC GCTATTCTGA CGCCTCAAAA CTGAGGGAGG AGATCGAGG6 AAAGGGATCC 
TTCACTTACT TTGCACCGAQ TAATGAGGCT TGGGACAACT TGGATTCTGA TATCCGTAGA 
GGT-rrGGAGA GCAACGTGAA TGTTGAATTA CTGAATGCTT TACATA3TCA CATGATTAAT 
AAOAQAATGr TGACCAAGGA CTTAAAAAAT GGCATGATTA TTCCTTCAAT GTATAACAAT 
TTGGGGCTTT TCATTAACCA TTATCCTAAT GGGGTTOTCA CTGTTAATTG TGCTCGAATC 
ATCCRTGGGA ACCAGATTGC AACAAATGOT GTTGTCXXTG TCATTGACCG TGTGCTTACA 
CAAATTGGTA CCTCAA.TTCA AGACTTCATT GAAGCAGAAO ATGACCTTTC ATCnTTAQA 
GCAGCTGCCA TCACATC3GGA CATATTGOAG GCCCTTGGAA QAQACGGTCA CTTCACACTC 
TTTGCTCCCA CCAAT6AGGC TTTXGAGAAA CTTCCACQAS OrGTCCTASA AAGGTTCATG 
GOAOACAAAG TGGCTTCOBA AGCXCTTATQ AACTACCACA TCTTAAATAC TCTCCAGTOT 
TCTOAOTCTA TTATGGGAea AGCAGTCTTT GAGAOGCIGG AAGGAAATAC AATTGAGATA 



GGATGTG^CG GT6ACAGTAT AACAGTAAAT GCAATCAAAA TGGTGftACAA AAAGGiRTATT 1080

GTGACAAATA ATGGTGTGAT CKATTTGATT GATCAGGTCC TAATTCCTGA TTCTGCCAAA 1140 

a«GTTATT6 AGCTG6CTGG AAAACAGCAA ACCAOCTTCA OGGATCTTGT GGCCCAATTA 1200 

GGCTTGGCAT CTGCTCTGAG GCCAGATGGA OAATACACTT TOCTGGCACC TOTQAATAAT 1260 

GCATTTTCTG ATGATACTCT CAGCATGGTT CAGCX3CCTCC TTAAATTAAT TCTGCAGRAT 1320 

CACATATTGA AAGTAAAAGT TGGCCTTAAT GAGCTTTACA ACGGGCAAAT ACTGGAAACC 1380 

ATCQGAG6CA AACAGCTCAG AGTCTTCX3TA TATCGTACAG CTGTCrGCAT TQAAAATTCA 1440 

TGCATGGAGA AAGGGAGTAA GCAAGGGAGA AACGGTGCSA TTCACATATT COGCGAGATC 1500

ATCAAGCCAG CAGAGAAATC CCTCCATGAA AAGTTAAAAC AAGATAAGCQ CTTTAGCACC 1560 

TTCCTCAGCC TACTTGAAQC TGCAGACTTG AAAGAGCTCC TGACACAACC TGGAGACTGG 1620 

ACATTATTTG TGCOUWXAA TQATaCTTTT AAOGGAATGA CTAGTGAAGA AAAAGAAATT 1680 

CIGATAOQGG ACAAAAATGC TCTTC3VAAAC ATCATTCTTT ATCACCTGAC AGCAGQAGTT 1740 

TTCATTGGAA AAGGATTTGA ACCTGGTGTT ACTAACATTT TAAAGACCAC AC3VAGGAAGC 1800 

AAAATCTTTC TGAAAGAAGT AAATQATACA CTTCTGGTGA ATGAATTGAA ATCAAAAGAA 1860 

TCTGACATCA TGACAACAAA TGGTGTAATT CATGTTGTAG ATAAACTCCT CTATCCAGCA 1920 

QACACACCTG TTGGAAATGA TCAACTGCTG GAAATACTTA ATAAATTAAT CAAATACATC 1980 

CAAATTAAGT TTGTTCGTGG TAGCACCTTC AAAGAAATCC CCGTGACTGT CTATACaACT 2040 

AAAATTATAA CCAAACTTGT GGAACCAAAA ATTAAAGTGA TTGAAGGCAG TCTTCAGCCT 2100 

ATTATCftAAA CTOAAGOACX: CACACTAACA AAAGTCAAAA TTGAAGGTGA ACCTGAATTC 2160 

AGACTGATTA AAGAAGGTGA AACaiATAACT GAAGTGATCC ATQGAGAGCC AATTATTAAA 2220 

AAATACACCA AAATCATTGA TGGAGTGCXn: GTGGAAATAA CTGAAAAAGA GACACGA6AA 2280 

OAAtSSAATCA TTACAGGTCC TGAAATAAAA TACACTAOQA TTTCTACTGG A6GTGGAQAA 2340 

ACAGAAGAAA CTCTGAAGAA ATTGTTACAA GAAGAGGTCA CCAAGGTCAC CAAATTCATT 2400 

GAAGGTGGTG ATGGTCATTT ATTTGAAGAT GAAGAAATTA AAAGACTGCT TCAGGGAGAC 2460 

ACACCCGTGA GGAAGTTGCA AGCXaUVCAAA AAAGTTCAAG GTTCTAGAAG ACGATTAAGG 2520 

GAAGGTCGTI CTCAGTGAAA ATCX»AAAAC CAGAAAAAAA TGTTTATACA ACCCTAAGTC 2580 

AATAACXrrOA CCTTAGAAAA TTGTGAGAGC CAAGTTOACT TCAGGAACTG AAACATCAGC 2640 

ACAAAGAAGC AATCATCAAA TAATTCTGAA CACAAATTTA ATATTTTTTT TTCTQAATGA 2700 

GAAACATOAO GGAAATTOTQ GAGTTA6CCT CCTOTGCTAA AGSAATTGAA GAAAATATAA 2760

CACCTTACAC CCTTTTTCAT CTTCACATTA AAAGTTCTGQ CTAACTTTGG AATCCATTAG 2820 

AGAAAAATCC TTGTC3VCCAG ATTCATTACA ATTCAAATOQ AAGAGTTGTG AACTGTTATC 2880 

CCATTGAAAA QACCGAGCCT TGTATGTATG TTATGGATAC ATAAAATGCA CGCAAGCCAT 2940 

TATCrCTCXaV TGGGAAGCTA AGTTATAAAA ATAGGTGCTT GGTGTACAAA ACTTTTTATA 3000 

TCAAAAGGCT TTGCACATTT CTATATGAGT GGGTTTACTG GTAAATTATG TTATTTTTTA 3060 

CAACTAATTT TGTACTCTCA GAATGTTTGT Ca.TATGCTTC TTGCAATGCA TATTTTTTAA 3120 

XCTCAAACQT TTCAATAAAA CCATTTTTCA GATATAAAGA GAATTACTTC AAATTGACTA 3180 
ATTCAGAAAA ACTCAAGATT TAAGITAAAA AGTGGTTTGG ACTTGGGAA 

Seq ID NO: 401 Protein sequence
Protein Accession #: NP_006466.1

1 11 21 31 41 51

| | | | | |

MIPF1J>MFSL LlilililVNPIN ANNHYDKILA HSRIRGRDQG PNVCAIiQQIL GTKKiaFSTC 60 

KNWYKKSICG QKTTVLYECC PGVMRMBGMK GCPAVLPIDH VYGTLGIVGA TTTQRVSDAS 120 

KLREEIEGKG SFTYFAPSNE AKDNLDSDIR RGLESNVNVE LLNALHSHMI NKRMLTKDLK 180 

NGHIIPSMXN HLGLFINHYP NGWTVNCS^ lIHGNQlATN GWHVIDRVL TQIGTSIQDF 240 

lEAEDDLSSF RAAAITSDIIi EALGRDGHFT LFAPTNKAFE KliPRGVLERF MGDKVASEAL 300 

MKYHILNTLQ CSESIMGGAV FBTLEGWTIE IGCDGDSITV NGIKMVNKKD IVTNNGVIHIi 360 

IDQVLIPDSA KQVIBLAGKQ QTTFTDIjVAQ LGIASAUIPD GEYTLIAPVM NAFSDDTLSM 420 

VQRLLKLILQ NHIUCVKVGL NEiniGQIIiE TIGGKQLRVF VYHrAVClEM SCMEKGSKQG 480 

RNGAIHIFRE IIKPAEKSIjH BKUCQDKRFS TFbSUiEAAD LKBIiTQPCT WTI*VPTODA 540 

FKGMTSEEKB ILIRDKNALQ NIHiYHIiTK} VFIGKBFBPO VTMILKrtQO SKIPLKBVND 600 

TLLVNELKSK ESDIMTTNGV IHWDKIiIiYP ADTFVGNDQIi LEILNldjIlOf IQIKFVRGST 660 

FKEIPVTVYT TKIITKWEP KIKVIEGSLQ PIIKTEQPTL TKVKIBGEPE FRLIKEGETI 720 

TEVIHGEPII KKYTXIIDGV PVEITEKETR EERIITQPBI RYTRISTGGG ETEETLKKLL 780 
QEEVTKVTKP IEGGDGHI.FE DEEIKRiLQG DTPVRKU}AN KKVOGSRHIOi RBGRSQ 

Seq ID NO: 402 DNA sequence
Nucleic Acid Accession #: NM_002416
Coding sequence: 40..417

1 11 21 31 41 51 

| | | | | |

ATCCAATACA GGAGTGACTT GGAACTCCAT TCTATCACTA TGAAGAAAAG TGGTGTTCTT 60 

TTCCTCTTGG GC3VTCATCTT QCTGGTTCTa ATTGGAGTGC AAGGAACCCC AGTAGTGAGA 120 

AAGGGTCGCT GTTCCTGCAT CAGCACCAAC CAAGGGACTA TCXACCTACA ATCCTTGAAA 180 

OACCTTAAAC AATTTGCCCC AAGCCCTTCC TGCGAQAAAA ITGAAATCAT TGCTACACTG 240 

AAGAATGGAG TTCAAACATG TCTAAACCCA GATTCAOCAa ATGIGAAOOA ACTGATTAAA 300 

AAGIGOOAOA AACAGGTCAG CXJUUUU3AAA AAGCAAAAGA ATGGGAAAAA ACATCAAAAA 360 

AAGAAAOTTC TOAAAGTTOa AAAATCTCAA OOTTCTCGTC AAAAGAAGAC TACATAAGRG 420 

ACCACTTCAC CAATAA6TAT TCTGTGTTAA AAATGTTCTA TTTTAATTAT ACOGCTATCA 480 

TTCCAAAGGA GGATGGCATA TAATACAAAG GCTTATTAAT TTGACTAGAA AATTTAAAAC 540 

ATTACTCTGA AATTGTAACT AAAOTTAGAA AGTTGATTTT AAGAATCCAA ACGTTAAGAA 600 

TTGTTAAAGG CTATGATTGT CTTTGTTCTT CTACCACCCA CCAGTTQAAT TTCATCATGC 660 

TTAAGGCCAT GATTTTAGCA ATACCCATGT CTACACAGAT GTTCACCCAA CCACaTCCCA 720 

CTCACaWWaG CTGCCTGQAA OAGCAQCCCT AGGCTTCCAC GTACTGCAGC CTCCAGAGAG 780 

TATCTQAGGC ACATGTCAGC AAOTCCTAAG CCTGTTAaCA TGCrGGTOAG CCAAGCAeiT 840 

TGAAATTGAG CTGGACCTCA CCAAOCXGCT GTGGCCATCA ACCTCTGTAT TTGAATCAGC 900 

CTACAGGCCT CACACACAAT OIGTCrQAGA GATTC AjaC T GAT TGTTA TT QGGTATCACC 960 

ACTGGAGATC ACXaGTGTGT GGCTTTOVQA GCCTCCTTTC TGGCTTTGGA AGCCATGIGA 1020 

TTCCATCTTG CCCGCTCAGG CTGACXa^CTT TATTTCTTTT TGTTCCCCTT TGCTTCATTC 1080 

AAGTCAGCTC TTCTCCATCC TACCACRATG CAGTGCCTTT CTTCTCTCCA GTGCACCTGT 1140 

CaTATGCTCT GATTTATCTC AGTCAACTCC TTTCTCaTCT TGTCCCCAAC ACCCCaCaGA 1200 

AGTGCTTTCT TCTCCCAATT CATCCTCACT CAGTCCAGCT TAGTTCAAGT CCTGCCTCTT 1260 

AAATAAACCT TTTTGGACAC ACAAATTATC TTAAAACTCC TGTTTCACTT GGTTCAGTAC 1320 

CACATGGGTG AACACTCAAT GGTTAACTAA TTCTTGGGTG TTTATCCTAT CTCTCCAACC 1380 
AOATTGrCAG CTCCTTOAGG
CTAATAATAC 
TG6C»ACCAO ACCATTGTCT 
CTAGCCTCTG GTAACCTCTT 
GATGCAACAT CCTTGTCTTT 
GCACGTGGTA AAACACTTGC 
AAAATCATAT AATCTTACAA 
OMACCATAC AAAAATTCXJT 
TCTAAGATCT AACAAGATAG 
AGTTTTATTG TCCGTTTACT 
1CICCCKIGH. AGAAAG6GAA 
TAOIGGAAGC ATGATTGGT6 
GGAGGTTCAG TGAATTQTGT 
CTTTCCCAAA TTGAATCACT 
TCCCACCCGA ACX^TCTTATC 
AAAAATCTAA GTGTTTCATA 
GTAGACAGTA TATAACXAAC 
TCATTTATCA TATATATACA 
AAACAGTATT GACTTQTATA 
TATCAATAAA TAGACCATTA 



TGTGGAACTA G6TTTTAATA ATTTTTTA* 



ACTTATTATC 
TTATGACAGG 

TGAAAAGGAC 
TTTCCOGAAG 
CCACCGAGAT 
TGTTTCAGAG 
CGGTGAAGTA 
CCCAGTTAGC 
AGGAGAGGTT 
GCTCACACTG 
TAATCATGAA 
AATTTGAQAG 
AACCAAAGAC 
TACATGCATA 
CCTTGTAATT 
ATCAG 



TTCRGGACAC 



GACTGTTTTT 
TTTATAGATC 
GAAAAGGGCT 
CCTTATCGAA 
TTTGTATTGT 



CCCTGTTTCT 
TGATGTTGTT 
TCCTGGCTAC 
TCACTACAGG 
AGCTTCTCCA 
AAAAAATATA 
AGCCAGTGAC 
TTCTCAATAA 
ACTCATTTTA 
GATTATCAAT 



TCCACAGTGC 
ATGGGCAGGA 
TCCATGTTGG 
GACCAGGGAT 
ACAATAAGAA 
CAGTTTACXX; 

CAACxrrTrrc 






GGCAAATATG 



CTCTGCAGGA 



CTGATGATTT 
ACTCCCTAGT 
TCTGTGACCC 



TGTGGAAACC 
AGAATTTAAA 
AGAGTGCTGT 
TCCTTCATGT 
ACTTACCTTC 
CACTGACACaV 
GCAAATAATT 
TCTTTGTTAA 



CCAAGTCGGT 
TCCTTCCAGG 
CCTATACTC» 
CCGGTGGAGA 
AACTTCCCTO 
CATCTCAC3«3 
CACGTTATAA 
TTTCACTTCA 
AATAGAATGG 



1800 
1860 
1920 
1980 



2160 
2220 
2280
2340 
2400 
2460 



| | | | | |

MKKSGVIiPl,!. GIILLVLIGV QGTPWRKGR CSCISTNQGT IHMJSLKDUC QFAPSPSCEK 
lEIIATLKNG VQTCLNPDSA DVKELIKKWE KQVSQKKKQK NGKKHQKKKV UCVRKSQRSR 



Seq ID NO: 404 DNA sequence

Nucleic Acid Accession #: NM_006670

Coding sequence: 85..1347



21 

i 

CCAGCCTCCC 
CGCGATGCCT 
GCGACTAGCG 
CTCCTTCTCC 
GQACCAGTGC 
CCGCAATCTG 



AGCTCOGGGG 
GAC6GGCGTC 
TCrCXX»CCT 



AAACGC6AGC 
TGOGGCTGGC 
CCTCGGCATC 
CCCCQCTGCC 
AGIGCSTTAA 
TCTTCCTTAC 



CCACTGGCCG 
AGTCCCCTTG 
CGGAGCTTCG 
CGCCGCTTGG 



TCCTTCCGCA 
CTTCACAATG 
AACAATCCCT 



AGCTGGCCAG 
TCAGGCACCT 
ACCTGACACA 



GCATCTGCCC 
CTTCGCTTTC 
CXTTGAACCAC 
GGTGGCGGCC 
CAACCACTTC 
GGACTTAAGT 
TCTAGAAAGC 



GTCCTCTTGG 
CAAACCTCTT 
GTTTTGTATT 
AGGGATCACA 
AACCTCAQTT 
CATGAGATGT 
TAOATACAAC 



AGGGCAAASA 
AACTCAACAG 
ATGTCTTCCT 
TGAACCGCaVA 
TGGAAQGGTA 
CTAACTCGGA- 
AGACTTAAGC 



TTCTTTTTCT 



ACAGATAGCA 
TATCAGTTTT 
CT6CAGAOGT 
AGAGCATGCT 
TTCTTTGACA 
TWTAATAAA 



TGTTCTQTTA 
TGGAACTCCT 
GCTGTCTGTC 
TTCAACAAAA 
ATTCTCATGT 
TAGCAGGCTC 



AA6TAAATTA 



TGCTGACCTG 
GGGTATTGTT 
GGGGATAAAA 
TCATTACAGA 
TGTCTGAGAA 
TTTATCCCTA 
TAAAAQCAGT 
ATGTAA6ACG 
CAACACGTAT 
TCTCTCTCAG 
GCTGCCTCAA 
ACCTAAGTTG 
TTCAAAATAA 
TGTTCTGCAT 
CTTTTTTGAT 



TCCTCGGCGC 
CCCGCGCTGT 
ACCOAGGTGC 
CTGGCOGTGC 
CTCAACCTCA 
A6CCTG0GCC 
TCGGGCAGCA 
ATCGTGCCCC 
CTGCTGGCGG 
CTTTACCTGC 
AATAATTOGC 
dCCACCTGG 
GGTCTACCCC 
GCAOACATGG 
TGTGCATATC 
GACTGTGACC 
TTAGCCCTGA 



3 AGOSGGOGCC GTCCCAGCCC 
r CCCGGGGCCC CGCCGCCGGG 
TGC3GCTGGGT CTCCTCGTCT 
CGTTCCTGGC TTCCGCCGTG . 
GCGAOTGCTC CGAGGCAGCG 
CCACQGACCT GCCCGCCTAC 



AGCTCX3ACCT 
ATGCCAGCGT 
CTGAAGAXGA 
GCCGTGCACT 
CGOGGGATGT 
TGGTGAGCCT 
AGGACAATGC 



CGOAAAAAAT 
CGATTCTTCC 
TAGGCGCTAT 
ATAACATCAG 



GCGGCAGAAC 
GCAGGGGCTC 
GCTGGCCCAA 
GACCTACGTG 
CCTCAAGGTC 

rrrcciGGAC 

CAAGGAAACA 



TATGAAATCA 
ATATTAGAGG 
CTAGGCTTGC 
GAAGGGGATT 
AT6AACAGTT 
GGAGGGATTT 
TACSVGTTCAA 



TGGAGAAAA1 



ATTACAAAAA 
TGCAGTTTAT 
CTGAATTGTT 



ACAGACCAAG 
TCCACTTTCA 
TGCTTCCTTG 
GTGTATAGTG 
TTCAGGTTTC 
GGTGTAGCAA 
AAAAATACTT 
AATTGCATCC 
CACAGGAGCA 
ATAACTTQCA 
ATGAAAAT6T 



CECATCCCTG 
TTTCCTCCTG 
AGATGCCTGC 
CAGATTAACA 
GACAACTCTG 
TCCTCCACTA 



TTTTACCCTC 
A6CATGAACA 
GTGTACCCaC 
TATTCATAAA 



ACTT CATAAC 
ACTGATTTTT 
AAAAATAAAQ 



11 



21 



31 



51



1380 
1440 
1500
1560
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 



| | | | | |

MFGGCSRGFA AGDGRI/RLAR LALVIiLGWVS SSSPTSSASS FSSSAPFliAS AVSAQPPLPD 
QCPALCKCSE AARTVKCVNR NLTEVPTDLP AYVRNLFLTG NQLAVLPAGA FARRPPLAEL 
AAUILS6SRL DEVRAGAFEH LPSLRQIiDLS HNPLADLSPP AFSGSNASVS APSPLVELIL 
NEIVPPEDER QtlRSPEGMW AAIiLAGRALQ GLRRLELASN HFLYLPROVIt AQLPSLRHLO 
LSNNSLVSLT YVSFRNLTHL E3iai.EI»IAL KVLHNGTLAE LQG1.PHIRVF LDNNPMVCDC 
HMADMVTWLK ETEWQGKDR LTCAYPEKMR NRVLLBLNSA DLDCDPItiPP SLQTSYVFLQ 
rVLALIGAIF LIiVLYIiNRKG IKKHMHHIRD ACSDKM&eXH YRVEINADPR LTNLSSNSDV 
Coding sequence: 1..927
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XI 






ATGCCTGGGG 
CTAGCXJCTGG 
TTCTCCTCCT 
CAGTGCCCCG 
AATCTGACCG 
AACCAOCTGG 
AOCCTCAGGC 
OGCAACCTQA 
AATCWCACXX: 

cccraoGTCT 

GTGCAGGGCA 
TTGOAACTCA 
TCTTATGTCT 
TATTTGAACC 
CACATGGAAG 
AGTTCTAACT 



GGTGCTCCCG 
TACTCCTGGG 
CGGCGCCGTT 
CX3CTGTGCGA 
AGQTGCCCAC 
CCAGCAACCA 
AOCroGACTT 
CACATCTAGA 
TGGCTGAGTT 
GCGACTGCCA 
AAGACCGGCT 
ACAGTGCTGA 
TCCTGGGTAT 
GCAAGOGGAT 
GGTATCATTA 
CGGATGTCCT 



AAGTAATAAT 
AAOCCTCCAC 
GOUVGGTCTA 
CATGGCAGAC 
CACCTGTGCaV 
CCTGGACTGT 
TGTTTTAGCC 
AAAAAAGTGG 
CAGATATQAA 
CGAGT6A 



GCCTACGTGC 
CTGCCGCGGG 
TCGCTGGTGA 
CTGGAGGACA 
CCCCACATTA 
ATGGTGACCT 
TATCCGGAAA 
GACCCGATTC 
CTGATAGGCXS 
ATGCATAACA 
ATCAATGCGG 



GGOGTCTGCO 
CCACCTCCTC 
CCCAGCCCCC 
CAGTCAAGTG 
GCAACCTCTT 
ATGTGCTGGC 
GCCTGACCTA 
ATGCCCTCAA 
GGGTTTTCCT 
GGCTCAAGGA 



GGCATCCTOC 120 



CGTTAACCGC 
CCTTACCGGC 
CCAACTGCCC 
CGTGTCCTTC 
GGTCCTTCAC 
GGACAACAAT 
AACAGA6GTA 
TCGG6TCCTC 



1 11 21 31 41 51

ilPGGCSRGPA AGDGRLRLAH IJVLVUjOHVS SSSPTSSASS FSSSAPFLAS AVSAQPPIiPD 
QCPALCECSE AARTVKCVHR MLTEVPTDLP AYVRNLFLTG NQLASNHFLY LPRSVliAQliP 
SLRHLDLSNM 8I.VSLTYVSF RNLTHLESIiH LEDHALKVLH MGTLAE1X]QL PHIHVPLDMH 
FWVCDCHKAD MVTWUCETBV VQaKDRI.TCA YPBRMRtlRVL LEUISADUX: OPILPPSLQT 
SYVFLQIVLA IiIOAIPLIiVIi YUISXaiKKH KKNIRUACRD HMEGYHYRYB INADPHLTNIi 



41 51
I I 
GCCTGOSTTC TTCTGCTCAC 
TTGGGCTCAG ACCTOGGCCC 
CAGCACOTGC GGOACTGGCT 
GTGATGGAQT GTGACGCGTG 
CGCGCCCTGC TCCACTGCGC 



Seq ID NO: 408 DNA sequence
Nucleic Acid Accession #: NM_000095.1
Coding sequence: 26..2299



CAGCACCCAG 
CCTGGCTGCC 
GCAGATGCTT 



CCCCTGCCCC 
CGCCCACCCC 
GGCTTGCCCG 
GQCCAACAAG 
CCCCAACTCC 
CTTCGTGGGC 



GTCAGGGAGA 
CAflTCAaTAC 
TOCTTCCCCG 
GCGGGCTTCA 
TGCTTCCCCC 
CCGGGGTACA 
CAGGTTTGCA 
GTGTGCATCA 
GACCAGGCGT 



CCGCCATGGT CCCCGACACC 
CCGGACAGGG CCAGAGCCCG 
AGOAAACCAA CGCGGCGCTG 
TCACGTTCCT GAAAAACACG 
GCACCGCCCT ACCCAGCQTG 
OCGTGGCCTG CATCCAOACG 
CQGGCAACGQ CTCGCACTGC 
GAGTCCGCTG TATCAACACC 



CGGaCATCAA 
ACACCCGGGG 
CCGGCTGCCA 
AGCATGCAGA 



AGACGGCTTC 
CGTGACTGTG 
CGATOCQQAT 
GAACCCAGAC 
GTCCCAGAAG 



CCCAACTCAG 



CTCAOACCAG 



AACX3ACGACC 
GACGGCGACC 
AAGGACAGTG 
CCGGATCAGG 



AGCTGCGCTG 
GGCAGGAGGA 
ACGGGGTCCC 
CGGACGAGGA 
AAAAGGACAC 



CAATGAAAAG 
CAA6TG06GC 
AGACCAGGAC 



ACCGACGTCA 
AGCCCGGGGT 
GTGGGGCTGQ 
ACCGGGCAAC 
TGCGGCCCGT 
CAGCGCTTCT 
GAGCGCGATQ 
TGTGGTCGCG 
CAGTGCCGTA 
QATQGCATCO 
QACAACTGCC 



AOGAGTGCAA 
TCCGCTGCGA 
CTTTCGCCAA 
ATAACTQCGT 
GCCAGCCCGG 
GCCCCGACGG 
GCTCGCGGTC 
ACACTGACCT 
AGGACAACTG 
GABACGCCTG 



OSATCAAGAC 
TAACAGTGCC 
OGACAATOAC 



ATGGCGATGG TATAGGGGAT 
CGGATGTGGA CCACGACTTT 
AOGGACATCA GGACTCTCGG 
CAOACCACGA TGGCCAGGGT 
ACAQTCGGGA CAACTOCCGC 



GGCOQQGGCO ATGCOTGCOA 
AACTGCCCTA GGGTACCCAA 
GCCTGTGACA ACTGTCCCCA 
GTGGGAGATG CTTGTQACAG 
GACAACTGTC CCACGGTGCC 
GATGCCTGCG ACGACGACGA 



QOTGGTAGAC 
GGCCTTCCRG 
GGTGCTCAAC 
GGGTTACACT 
GGATGACGAC 
CATGTGGAAG 
GCCTGGCATC 
CGCTCTGTGG 
AAACGTGGGT 
GGGCTACATC 
CTTGGACACA 
CATCTGGGCC 
TCAGCTGCGG 
GOGGCTGGAT 
AAGGGCTCAO 



3 TGTGTCCGGA GAACX3CTGAA 



GCCTTCAATG 
TATGCGGGCT 
CAGATGGAGC 
CAACTCAAGG 
CATACAGGAG 
TGGAAGGACA 
AGGGTGOGAT 
ACCATGCGGG 
AACCTGCGTT 
CAAGCCTAGG 
GGGGGCTCTG 
AGASGACAAA 



OACAATGAAC 
CGAGGGCACQ 
CTACCAGGAC 
GCAGGCGAAC 
TTCCACAGGC 



TC6TTGGTTC 



GAGGACTTTO 
GTCACGCTCA 
CAGATTGACC 
AGOQACCCAS 
TTCCATGT6A 
AGCTCCAGCT 
CCCTTCCGTG 
CCOGGGGAAC 
CTGCTGTGGA 
CTGCAGCACC 



ATGCAQACAA 
CCQACTTCAO 
CCAACTGGGT 



ACACGGTCAC 
TCTACGTGCT 
CTGTGGCCX3A 
ASCTGOGGAA 
AGGACCCGOQ 



A6ATCGTGCA 
GCQTGGACTT 
TCATCTTTGG 
AAACGTATTG 
CTGTGAAGTC 
ACACAGAGTC 
AGAAGTCCTA 
TCTATGMQGO 
GTGGCCGCCT 
ACCGCTGCAA 
GACCAGGQTQ 

CACCCAGCCC AAGGGGTGGC CGTCCTGAGG GGGAAGTGAG 
ATAAAGTGTG TGTGCAGGG 



TOACACCATC CCAGAGOACT ATOAQACCCA 



1080 
1140 
1200 



1620 

1680

1740 



1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
FLKNTVMECD ACGMQQSVRT GLPSVSPLLH CAPOFCFPGV ACIQTESGGR OGPCPAGPTG 120

NGSHCTDVNE CNAHPCFPRV RCINTSPGFR CEACPPGYSQ PTHQSVGIAF AKANKQVCTD 180 

IHBCETGQHN CVPNSVCINT RGSFQCGPCQ PGFVGDQRSG C3QRGAQRFCP DGSPSECHEH 240 

W3CVI.ERDGS RSCVCRVGWA GNGILCGRDT DLDGFPDEKL RCPEPQC31KD KCVTVPNSGQ 300 

BDVDRDGIGD ACDPDADGDG VPNEKDMCPL VRMPDORMTD EDKMODACDN CRSQKNDDQK 360

DTDQDGRGDA CDDDIDGORI RNQADHCPSV PHSOQXDSOa OGI^IACDNC PQKSNPDOAD 420 

VDHDFVGDAC DSDQDQDGDG KQDSRONCPT VPNSAQEOSD HDOQOISACDD DDDNDGVPDS 480

RONCRIiVPHP GQEDADROGV GDVCQDOFDA OKWDKIDVC PEHAEVTLTD FRAFQTWLD 540 

PEGDAQIDPN WWLNQGREl VQTMNSDPOI. AVOYTAPNOV DFE6TFHVNT VTDDDyAOFI 600 

FGYQDSSSFY WMWKQMEQT VWQANPFRAV AEPGIQLKAV KSSTGPGEQIj HNALWHTGDT 660

ESQVRLLWKD PHNVGWKDKK SYRHFLQHRP QVGYIRVRFY EGFBliVADSN WIiDTTMRGG 720 
RLGVFCPSQE NIIWAKLRYR CNDTIPEDYE THQLRQA 

Seq ID NO: 410 DNA sequence
Nucleic Acid Accession #: NM_001565.1
Coding sequence: 67..363

1 11 21 31 41 51 

| | | | | |

GAGACATTCC TCAATTGCTT AGACATATTC TGAGCCTACA GCAGAGGAAC CTCCAGTCTC 60

AGCACCATGA ATCAAACTGC GATTCTGATT TGCTGCCTTA TCTTTCTGAC TCTAAGTGGC 120 

ATTCAAQOAa TACCTCTCTC TAGAACCGTA CGCTGTACCT GCATCAGCAT TAGTAATCAA 180

OCTGTTAATC CAAGGTCTTT ACSAAAAACTT GAAATTATTC CTGCAAGCCA ATTTTGTCC3V 240 

CGTGTTGAGA TCATTGCTAC AATGAAAAAG AAGGGTGAQA AQAOATGrCT GAATCCAGAA 300 

TCGAAGGCCA TCAASAATTT ACTOAAAGCA GTTAGCAAflG AAATGTCTAA AAGATCTCCT 360

TAAAACCAGA GGGGAGCAAA ATCGATOCAG TGCTTCCAAO GATGQACCAC ACABRGGCTQ 420 

CCTCTCCCAT CRCTTCCCTA CATGGAGTAT ATGTCAAGCC ATAATTGTTC TTAGTTTGCA 480 

GTTACACTAA AAGCSTQACCA ATQATGGTCA CCAAATCAGC T6CTACTACT CCTGTAGGAA 540 

GGTTAATCTT CATCATCCTA AGCTATTCAG TAATAACTCT ACCCTGGCAC TATAATGTAA 600 

QCTCTACIGA GGTGCTATGT TCTTAGTGGA TGTTCTGACC CTGCTTCAAA TATTTCCCTC 660

ACCTTTCCCA TCTTCCAAGG GTRCTAAGGA ATCTTTCTGC TTTGGGGTTr ATCftGA ATTC 720 

TCAGAATCIC AAATAACTAA AAGGTATGCA ATCAAATCTG CTTTrTAAAG AATGCTCTTT 780 

ACTTCATGOA CTTCCACTOC CATCCTCCCA AGGGGCOCAA ATTCTTTCAO TGGCTACCTA 840

CATACAATTC CAAACACATA CAGQAAGQTA GAAATATCTG AAAATQTATG TQTAAGTATT 900 

CTTATTTAAT GAAAGACTGT ACAAAGTATA AGTCTTAQAT GTATATATTT CCTATATTGT 960

TTTCAGTGTA C3«rGGAATAA CATOTAATTA AGTACTATGT ATCAATGAQT AACAGGAAAA 1020 

TTTTAAAAAT ACAGATAGAT ATATGCTCTG CATGTTACAT AAGATAAATO TOCTGAATGO 1080 

TTTTCAAATA AAAATGAGGT ACTCTCCTGG AAATATTAAQ 


| | | | | |

MtlQTAlrilCC liIFIiTliSOIQ GVPLSRTVRC TCISISNQPV mSSUBKLBX IPASQFCPRV

EIIATMKKKG EKRCLHPESK AIKNLIiKAVS KEHSKRSP 

Seq ID NO: 412 DNA sequence
Nucleic Acid Accession #: XM_057014
Coding sequence: 143..874

1 11 21 31 41 51 

| | | | | |

GGGAGGGAGA GAGQCGCGCG GGTGAAAGGC GCATTGATGC AGCCTGOGGC GGCCTCGGAG 
CGCGGCGGAQ CCAGACGCTG ACCACGTTCC TCTCCICGQT CXCCTCOGCC TCCAGCTCCQ

OGCTGCCCGG CAGCCGGGAG CCATGCGACC CCAGGGCCCC GCOGCCTCCC CGCAGCGGCT 

COGCGGCCTC CTGCTGCTCC TGCTGCTGCA GCTGCCCGCG CCQTCGAGCG OCTCTGBGAT 

CCCCAAGGGG AAGCRAAAGG OQCAOCTCCa QCAaAQGQAQ SIG6TGGACC TOTATAATOQ 

AATGTGCTTA CAAGGGCCAG CAGGAGTGCC TGGTCGAGAC GGGAGCCCTG GGGCCAATGQ 
CATTCCGGGT ACACCTGGGA TCCCAGGTCG GGATGGATTC AAAGGAGAAA AGGGGGAATG

TCT6AGGGAA AGCTTTGAGG AGTCCTGGAC ACCCAACTAC AAGCAGTGTT CATGGAGTTC 

ATTCAATTAT GGCATAGATC TTGGGAAAAT TGCGGAGTGT ACATTTACAA AGATGCGTTC 

AAATAGTGCT CTAAGAGTTT TQTTCAGTGG CTCACTTCGG CTAAAATGCA GAAATGCATG 

CT6TCAGCGT TGGTATTTCA CATTCAATGG AGCTGAATQT TCAQGACCTC TTCCCATTGA 
AGCTATAATT TATTTGGACC AAGGAAGCCC TGAAATOAAT TCAACAATTA ATATTCATCG

CACTTCTTCT GTQGAASGAC TTTGTGAAGQ AATTGGTGCT GGATTA6TG6 A3GTTGCTAT 

CTGGGTTGGC ACrTGTTCAG ATTACCCAAA AGGAGATGCT TCTACTGGAT GGAATTCftGT 

TTCTCGCATC ATTATTGAAG AACTACCAAA ATAAATGCTT TAATTTTCAT TTGCTACCTC 

TTTTTTTATT AT3CCTTGGA ATGGTTCACT TAAATGACAT TTTAAATAAG TTTATGTATA 
CATCTGAATG AAAAGCAAAG CTAAATATGT TTACAQACCA AAGTGTGATT TCACACTGTT

TTTAAATCTA GCATTATTCA TTTTGCTTCA ATCAAAAGTG GTTTCAATAT TTTTTTTAGT 

TGGTTAGAAT ACTTTCTTCA TAGTCACATT CTCTCAACCT ATAATTTGGA ATATTGTTGT 

GGTcrrrTGT TTrrrcTcrr AGTATAGCAT TTTTAAAAAA ATATAAAAGC TACCAATCTT

TQTACAATTT GTAAATCfTTA AGAATTTTTT TTATATCTGT TAAATAAAAA TTATTTCCAA 
CAACCTTAAA AAAAAAAAAA AAAA



1 11 21 31 41 51

| | | | | |

MRPQGPAASP QRI.RGLIJ.IiI> LliQLPAPSSA SBIPKGKQKA QLRQRSWDL YMGMCLQGPA 
GVPGBOGSPG AMGIPGTP6I PGRDGFKGEK GECLRESFEE SWTPHYKQCS HSSLNYGIDI. 
GKIAECTFTK KRSNSAIAVI. FSGSUILKCR KAOCQRWVPT PtiGAECSaPL PIEAIIYLDQ 

GSPEMNSTIN IHRTSSVEGI. CEGI6AGLVD VAIWVOTCSD YFKGDASTGH HSVSRIIIBE
LPK 
Nucleic Acid Accession #: XM_084007

Coding sequence: 138..2405
AAATTAGTCC
ATCATCTACA 
TCAGAAAATT 
ACCACQACCA 
ASCATCACTC 
CTGCTTCTGG 
OTAAAOATCC 
GAAGGAAT6T 



CCAAAOATGT 
TQGCTGGTAG 
GAAACACAAA 
GCATGOSCAT 
TCAACCAAAT 



11 
I 

ATTCGGCACG 
CGT6TGGAAC 
AGGCGCAATG 
AAATCCCCTT 
GAATTGGGAA 
ACRGCTTTTC 
ACTTCAAAAT 
TCACTCAGAC 
ASACCAOOA6 
TAAIVAATAAG 
TAOAAACAGC 
CAAGGACAGT 
AACTCACTTT 
AAGCAaCTCC 
OAAAACAAAT 
TGAAAATCCT 



\ i I 

" AGACCGCOTG TTCGOGCCTO GTAGA6ATTT 
CAAACCTGC6 
GCGAGOAAGT 
CATGAACTAA 
TCTGGCATTA 
TACCGCTATG 
ATAGGCATAG 
CACGAGCATC 
CATCACTCTQ 



TATCTGTAAT 
AAGCAGCTGC 
ATGTTGACTT 
QAGAAAATAA 
ATAAGATTAA 
ACTCAGACCA 



GQAAAAGCTC TRGCCCAaA, 



CrrGATOCTG 
TTTCCCCCAG 
GGCAATTTCC 
TTCTTTGTCA 
AAGAATCCAT 
TGAGCGTCAC 
TCACTCCCAC 
CCAXGACrCA 



ACCTTTGCX:C 
ACCACTGAGA 
ACACGGCAAT 



TCAGTTTCCT 
AATTTCTCCT 
TACACCTTCT- 
CAATGGAAAT 
GTQCCTATTT 
TTCTTGTTGA 
AGAAGAAACC 
CTCAACTTTC 



CTATTCATTA 
GTCTCTGCTG 
GAGTTTCCTT 
TCCACATTCT 
GAAAAGAGGA 
TGATTCCAOS 
ACATGTCXrrC 
TGAAAATQAT 
AACAAATGAG 



GTTAGTGCTA 
CTAGAGACAA 
ACrCCACCCA 
GAATCTGTGA 
C3«3GAaTGTT 
CTGAATGCAA 
TCTTQTCTGA 



GTGAAGTQAC 
TAGAGACTCC 
GTGTCACATC 
GTGAGCCCCG 
TCAATGCATC 
CAGAGTTCAA 
TTCATACAAO 



AAGAGGTCAT 
GGTGCAAGAA 
TTCACCACCA 
CTCACAGTCA 
TGGCCTGGAT 
GTGCTGCTTT 
ATGAGTTGCC 
AGCAGGCTGT 
GAATTTTCAT 



GATAGCTCAT 
TAAATGCCAT 
TCATGACTAC 
CAGCCAGCGC 
GGTGATAATG 



GQGGTTATCT 
GTGGCACTGG 
CATGCAAGTC 
CCACTTTTCA 
TGGAAGGGTC 
ACATTGATCA 
GATGATGTGG 
GAGAAAGTAG 
TCCCACTTTG 



TAGTGCCTCT 
CC6TTGGGAC 
AQCACCATAG 
GTCATCTGTC 
TAAC3«3CTCT 



ACCAGAAOIT 
ctcaactgtg 
AAGACCTGQA 
AAAGAGCCGG 
AAAAGGCTTT 
AAAGCTACTG 
CTATCTCraT 
TGAAAAGAAG 
TTTTATAGCC 



ATACACCATG 
XCAGACCATO 
CATAATCATG 
GATASTTCava 



TTTGAGTGGT 

TcaTAGcavr 

TTCTCAAAAC 
AGGAGGCCTG 
AGATAAGAAG 
GCAGTTGTCC 
TG6AACTGAA 



GCTCATCCAC AGGAAGTCXA CAATGAATAT GTACCCA6AG 



TCATGAATTA 
CCTTTATAAT 
tggtcattat 
GXATGTTQCI 
ATGTAGCCGC 
TATGTTACTT 
GTTTAAATGC 
GTTTGTATGC 

i ta: 



AAAAATCACA 



TAAGAATGTG 
TAAAGGAGAA 
AAATTTGTTG 
ATAGAGTACA 



TCACATTTCC 
CATCATATTC 
TACTCTCGGG 
GGTGATGGCC 
TTATCAAGTG 
GGTGACTTTG 
GCATTGTCAG 
GCTGAAAATG 
CTQGTTGATA 
TGG6GGTATT 
ATTTCCATAT 
TAGAGTAGCT 
TGTACTATGC 
TGTTACAAAG 
ATCTGTATGT 
ACATGTTCTG 
TATTCCTATA 
CATGAAGCCT 



TATTGCCAAG 
CAAAATTATC 
TCATTTGATT 
GAGCAATTGT 
GATOTTTCTT 



AATGAATTCA 
TTATATACCA 
TTATATATCA 



CGATTCAGAA 
CTTTATATAC 
TTTTACACAA 



TAAATTAGAG 
TTCATTAAAC 
CTAATTAGTG 
AGCAATATAC 
GATGAGTACA 
CCAAAAGCTG 
AACTTTGATA 
AGTACTTTGA 
GGTACTGTAG 
TAAATTCCTT 



AOQATACACT 
TCX^TCATCA 
AGQAGCTGAA 
TGCACAATTT 
GTTTAAGTAC 
CTGTTCTACT 
CCATGCTGGC 
TTTCTATGTG 
TGGTACCTGA 
TCTTTTTACA 
TTGAACATAA 
TAAAAAGTIG 
AGCGTTTAAA 
TCAGTTAAAG 
GCAATTCACC 
TATGTTTCAG 
CTGGATTTTA 
AAAATACCAA 
TCTGAGAATT 
GGGAGAAATT 
ATTTTTGTCA 
TACATTTAAC 



CCACCACCAA 



GTGAGCXXKC 840 
ATGTATTCCA 900 
ACATCrCATG 960
CCAGCCATCA 1020 
GCTGAAA-rCC 1080 
ATTTCCATCA 1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 



GATGCTTTTT 

GAAGAACCAG 
ATAGAAGAAA 
TATTTCATGT 
AAAAAGAATC 
AAGTATGAAT 
GGCTATTTAC 
TTGGAAGAAG 



GTGAGTAGTT 
TATGACTGGA 
TATATGAGQA 
TATCTCTCAG 
CCATACTAGG 
ATATCAGCTT 



CAGCGATGGC 
TTCTGTTGCT 
AAAGGCTGGC 
GTATCTTGGA 
GATATTTGCA 
AATGCTGCAC 
GAATGCTGGG 
AATCGTGTTT 
TCAT AGTTTC 
GTTAGTGGGT 
GTAOSTTTTA 
GGTATTACCA 
GGAAAAATQT 
GGTCTCTGAA 
GAAAGCTTAT 
GGGGAGGCAT 
TAGAATTAAG 
GGATTATTTC 
TTTGTATAAT 
OAAATTGOAA 
TATGTATCAC 
TGTTCTGGTT 
TATTAAAACT 
TGCTTCAGTG 
CCTGTCTGTG 



GACGATCTCA 
AACCACCATC 
GTCGCCACTT 
CTAGCSWVTTG 
GTGTTCTGTC 
ATGACCGTTA 
ATGGCAAC3U3 
CTTACTGCTO 
AAT GATGCTA 
ATGCTTTTGG 
CGTATAAATT 
ASTAGGTCAT 
TTTGTGATTT 
ATATTTAAGT 
GTTTATTATG 
CTTTAATGCT 
GAACTGCTGG 
ACTGAATTTA 
AGATTCTTAT 



TTTCAAAATQ 
CAGACTGQ8T 
ACCT6GTTTA 




a GLAIGAAFTE G 
i:. GMATGIFIGH S 
SWGYPPLQHA GMLI^FGIML LIS 

Seq ID NO: 416 DNA sequence
Nucleic Acid Accession #: 
Coding sequence: 1..8487



1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460
2520 
2580

2700
2760 
2820 
2880
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420



| | | | | |

MABKLSVIU IiTPALSVTNP LHBIiKAAAPP QTTEKISPNW BSGINVDLAI STRQYHLQQL 
FYRYGENNSL SVEGFRKLLQ NIGIDKIKRI HIHHDHDHHS DHEHMSDHER HSDHEHHSDH 



GXLFPXDVSS STPPSVTSKS RVSRUUSKT 
LTSHGKQIOV PUIATBFMXL CPAIIHQIOA 
AISIISFLSI. liGVILVPUm RVFFKFLLSF 
HEEPAKEMKR GPLFSKLSSO NIEESAYFDS 
KKKNQKKPEW DDDVEIKKQI. SKYESQLSTH 
VLEEEBVMIA HAHPQEVraE YVPRGCKMKC 
QNHHPHSHSQ RYSRBEIiKDA GVATLAHMVI 
AVFCHELPHE LGDFAVLLKA GMTVKQAVLY 
AliTAGIJMYV A 
FRIMF 



NM_015419.1
1 11 21 31 41 51

| | | | | |

ATGCCCaUVOC GCGCGCACTG GGGGGCXXTTC TCCGTCGTGC TGATCCTQCT TTQGGGCCAT 60 

CCGOGAGTCG CGCTGGCCTQ CCCGCATCCT TGTGCCTGCT AOGTCCCCAG CGAGGTCCAC 120 

TGCACGTTCC GATCCCTGGC TTCCQTGCCC GCTGGCATTG CTAGACAOGT GGAAAGAATC 180

AATTTGOSGT TTAATRGCAT ACAGGCCCTG TOWSAAACCT CATTTGCAGO ACTGACOVAG 240 

TTGGAGCTAC TTATGATTCA a3<K»ATGftG ATCCCAAGCA TCCCCGATGG AGCTTTAACiA 300 

OACCTCAGCT CTCTTCAGGT TTTCAAQTTC AGCTACAACA AGCTGAGAGT GATCACAGGA 360 

CaVGACCCTCC AGGGTCTCTC TAACTTAATG AGGCTGCACA TTGACCACAA CAAGATCXSAG 420 

TTTATCCJVCC CTCAAGCTTT CAACX5GCTTA ACGTCTCTOA GGCTACTCCA TnGGAAGGA 480 

AATCTCCTCC ACXaUSCTGCA CCCXaGCACC TTCTCCACGT TCACATTTTT GGATTATTTC 540 

ACSACTCTCXav CCATAAGGCA CCTCTACTTA GCAGAGAACA TGGTTAGAAC TCTTCCTQCX: 600 

AQCATGCTTC QQAACATGCC GCTTCTGGAG AATCTTTACT TGCAGGGAAA TCCGTGGACC 660 

TGCGATTGTG AGATGAGATG GTTTTTGGAA TGGGATGCAA AATCC^GAGG AATTCTGAAG 720 

TGTAAAAAGG ACAAAGCTTA TGAAGGCGGT CAGTTGTGTG CAATGTGCTT CAGTCCAAAG 780 

AAGTTGTACA AACATGAGAT ACACAAGCTG AAGGACATGA CTTGTCTQAA GCX:TTCAATA 840 

GAGTCCCCTC TCAGACAGAA CAGGAGCAGG AGTATTGAGG AGGAGCAAGA ACAGGAAGAO 900 

GATGGTGGCA GCCAGCTCAT CXTTGGAGAAA TTCCAACTGC CCCAGTGGAG CATCTCTTTG 960 

AATAIGACCO AOOAGCACGG GAACATGGTG AACTTGGTCT GTGACATCAA GAAACCAATG 1020 

GA<roTGTAC3V AGATTCACTI OAACCAAAOG 6ATCCTCCAO ATATTGACAT AAATGCAACA 1080 

GTTGCCTTaO ACTTTGAGTG TCCAATGACC CGAGAAAACT ATGAAAAGCT ATGGAAATTG 1140 

ATAGCATACT ACAGTOAAGT TCCCGTQAAQ CTACaVCAGAG AGCTC ATGC T CAGCftAAGAC 1200 

CCCAGAGTCA GCTACCAGTA CAGGCAGGAT GCTGATGAGO AAGCTCTTTA CTACACAGGT 1260 

GTGAGAGCCC AGATTCTTGC AGAACCAGAA TGGGTCATGC AOCCATCCAT AGATATCCAO 1320 

CTGAACCGAC GTCAGAQTAC GGCCAAGAW3 GTGCTACTTT CCTACTACAC CCAGTATTCT 1380 

CAAACAATAT CCACCAAAGA TACAAGGCAG GCTCGaOGCA GAAGCTQGGT AATGATTGAG 1440 

CCTAGTGQAG CTGTGCAAAG AGATCAGACT GTCCTGGAAG GGGGTCCATG CCAGTTGAGC 1500 

TGCAAOGTGA AAGCTTCTGA GAGTCCATCT ATCTTCTGGG TGCTTCCAGA TGGCTOCATC 1560

CTOAAAGCGC CCATGGATGA CCX»GACAGC AAGTTCTCCA TTCTCAGCAQ TGGCTGGCTG 1620 

AGQATCAAGT CCXTGGAGCC ATCTQACTCA GGCTTGTACC AGTGCATK3C TCakAGTGAGG 1680 

QATQAAATGG ACCGCATGGT ATATAGGGTA CTTGTGCAGT CTCCCTCCAC TCAQCCAQCC 1740 

GAGAAAGACA CAGTGACAAT TGGCAAGAAC CCAGGGGAGT CGOTQACATT aCCTTOCAAT 1800 

GCrrTAGCAA TACCCGAAGC CCACCTTAGC TGGATTCTTC CAAACAGAAO GATAATTAAT 1860 

GATTTGGCTA ACACATCACA TGTATACATG TTGCCAAATG GAACTCTTTC CATCCCAAAG 1920 

GTCCAAGTCa GTGATAGTGG TrACTACAGA TGTGTGGCTG TCAACCAGCA AGGGGCAQAC 1980 

CATTTTACGa TGaOAATCAC AGTGACCAAG AAAGGGTCTG GCTTGCCATC CAAAAOAGGC 2040 

AGAOGCCCAG GTGCAAAGGC TCTTTCCAGA GTCAGAGAAG ACATCGTGGA GGATGAAGGG 2100 

GGCTOQQGCA TGtSGAGATGA AGAGAACACT TCAAGGAGAC TTCTGCATCC AAAGGACCAA 2160 

GAGGTGTTCC TCAAAACAAA GGATGATGCC ATCAATGGAG ACAAGAAAGC CAAGAAAGGG 2220 

AGAAGAAAGC TGAAACTCTG GAAGCATTCG GAAAAAGAAC C3U3AGACCAA TGTTGCAQAA 2280 

GGTCGCAGAG TGTTTGAATC TAGAOGAAGG ATAAACATGG CAAACAAACA OATTAATCGG 2340 

GAGCGCTGGG CTGATATTTT AOCCAAAGTC 08TGGGAAAA ATCTCCCTAA GGGCACAGAA 2400 

GTACCCCOGT TGATTAAAAC CACAAGTOCT CXATCCTTQA GOCTAOAAOT CACACCACCT 2460 

TTTCCTGCTG TTTCTCCCCC CTCAGCATCT CCTGTGCAGA C3«5TAACCAG. TGCTGAAGAA 2520

TCCTCAGCAG ATGTACCTCT ACTTGGTGAA GAAGAGCACG TTTTGGGTAC CATTTCXTTCA 2580 

GCCAGCATGG GGCTAGAACA CAACCACAAT GGAGTTATTC TTGTTGAACC TGAAGTAACA 2640 

AGCACACCTC TGGAGGAAGT TGTTGATGAC CTTTCTGAGA AGACTGAGGA GATAACTTCC 2700 

ACTGAAGGAG ACCTGAAGGG GACAGCAGCC CCTACACTTA TATCTGAGCX: TTATGAACCA 2760 

TCTCCTACTC TGCACACATT AGACACAGTC TATGAAAAGC CCACCCATGA AGMACGGCA 2820 

ACAOAGGGTT GGTCTGCAGC AGATGTTGGA TCGTCACCAG AGCCCACATC CRGTGAGTAT 2880 

GAGCCTCCAT TGGATGCrGT CTCCTTOGCT aAGTCIOAGC CCATGCAATA CTTTG ACCCA 2940 

GATTTGGAGA CTAAGTCACA ACCAOATGAa GATAAGATGA AAOAAOACAC CTTTGCACAC 3000

CTTACTCCAA CCCCCACCAT CTGGGTTAAT GACTCCAGTA C»TCACAGTT ATTTGAGGAT 3060 

TCTACTATAG GGGAACCAGG TOTCCCAGGC CAATCACATC TACAAGGACT QACAGACAAC 3120 

ATCCACCTTG TGAAAAGTAG TCTAAGCACT CAAGACACCT TACTGATTAA AAAGGQTATG 3180 

AAAGAOATGT CTCAGACACT ACAGQGAGGA AATATGCTAG AGGGAGACCC CACACACTCC 3240 

AGAAGTTCTG AGAGTGAGGG CCAAGAGAGC AAATCCATCA CTTTGCCTGA CTCCaCACTG 3300 

GGTATAATQA GCAGTATGTC TCCAGTTAAG AAGCCTGCGG AAACCACAGT TGGTACCCTC 3360 

CTAGACAAAa ACACCACAAC AGTAACAACA ACACCAAGGC AAAAAGTTGC TCCGTCATCC 3420 

ACCAT6AGCA CTCACCCTTC TCGAAGSAGA CCCAACGGGA GAAGGAGATT ACGCCCCAAC 3480 

AAATTCC6CC ACOOOCACAA GCAAASCCCA CCCACAACTT TTGCCCCATC AGAGACTTTT 3540 

TCTACTCAAC CAACTCAAGC ACCTGACATT AAGATTTCAA GTCAAGTGGA GAGTTCTCTG 3600 

GTTCCTACAG CTTGGGTGGA TAAC3VCAGTT AATACCCCCA AACAGTTGGA AATGGAGAAG 3660 

AATGCAGAAC CCACATCCAA GGGAACACCA OGGAGAAAAC ACGGGAAGAG GCCAAACAAA 3720 

CATCXJATATA CCCCTTCTAC AGTGAGCTCA AGAGCGTCCXS GATCCAAGCC CAGCCCTTCT 3780 

CCAGAAAATA AACATAGAAA C»TTGTTACr CCCAGTTCAG AAACTATACT TTTGCCTAGA 3840 

ACTGTTTCTC TGAAAACTQA GGGCCCTTAT GATTCCTTAG ATTACATGAC AACCACCAGA 3900 

AAAATATATT CATCTTAOCC TAAAQTCCAA GAGACACTTC CAGTCACATA TAAACCCACA 3960 

TCAQATGGAA AAGAAATTAA GGATGATGTT GCCACAAATG TTGACAAACA TAAAAGTGAC 4020 

AXTTTAGTCA CTGGTGAATC AATTACTRAT GCC31TACCAA CTTCTOGCTC CTTGGTCTCX: 4080 

ACTATGGGAG AATTTAAGGA AOAATCCTCT CCTGTAGGCT TTCCAGGAAC TCCAACCTGG 4140 

AATCCCTCAA GGACGGCCCA GCCTGGGAGG CTAC3U3ACAG ACATACCTGT TACCACTTCT 4200 

GGGGAAAATC TTACAGACXX: TCXXXTTTCTT AAAGAGCTTG AGGATGTGGA TTTCACTTCC 4260 

GAGTTTTTGT CCTCTTTGAC AGTCTCCACA CCATTTCACC AGGAAGAAGC TGGTTCTTCC 4320 

ACAACTCTCT CAAGCATAAA AGTGGAGGTG GCTTC3VAGTC AGGCAGAAAC CACCACOCTT 4380 

GATCAAGATC ATCTTGAAAC CACTGTGGCT ATTCTCCTTT CTGAAACTAG ACCACAGAAT 4440 

CACACCCCTA CTGCTGCCCG GATGAAGGAG CCAGCATCCI CGTCCXCATC C31CAATTCTC 4500 

ATGTCTTTGG GACAAACCAC CACCACTAAG CCAGCACTTC CCAGTCCAAO AATATCTCAA 4560 

GCATCTAGAG ATTCCAAGGA AAATGTTTTC TTGAATTATG TGGGGAATCC AGAAACAGAA 4620 

GCAACCCCA6 TCAACAATGA AGGAACACAG CATATGTCAG GGCCAAATGA ATTATCAACA 4680 

CCCTCTTCXX5 ACXXK3GATGC ATTTAACTTG TCTACAAAGC TGGAATTGGA AAAGCAAGTA 4740 

TTTGGTAGTA GGAQTCTACC ACGTGGCCCA GATAGCCAAC GCCAGGATGQ AAGAGTTCAT 4800 

GCTTCTCATC AACTAACCAG AGTCCCTGCC AAACXI CATO C TACCAACAGC AACASTGAGQ 4860 

CTACXTTGAAA TGTCOVCACA AAGCGCTTCC AGA TACT TTQ TAA CTTOC CA aTCACCTCOT 4920 

CACTGGACCA ACAAACCGGA AATAACTACA TATCXHrTCTO GGGCTTTGCC AGAGAACAAA 4980 

CAGTTTACAA CTCCAAGATT ATCAAGTACA ACSUtTTCCTC TCCCATTGta CATGTCCAAA 5040 
CCCAOCATTC CTAGTAAGTT
6TGTTTGGAA ATAACAAOVT 
AGAATTCCTC ATTATTCCAA 
CCACa«3TTGG GAGTCACCCX3 
GAGAGAAAAG TTATTCCAGG 
GACTTTOGCC CTCCGGCACC 
ACTAACTTAC AGAATATCCC 
TCTTCTGTCC AGTCCTCAGG 
CCTCCTGCAT CCAAATTCTG 
CAGACTGTGT CCGTCACCGC 
CCAAAGCCTT TCGTTACTTG 
AGGATACAAC GOTTTQAOaT 
CAAGATOGAG GCXAGTATAT 
GTCTTGCTTT CX3GTCACCGT 
ACTGTCTACC TGGGAGACAC 
CAAATTTCCT GGATCTTCCC 
OOCATCACCC TGCACGAAAA 
GG06TCTATA AGTGCGTGGC 



TACTOACOGA 
CCCTGAGGCA 
TGGAAGACTC 
GAGACCCCAG 
TTCCTACAAC 

TATGGTCTCT 
AAGCTTCCAC 
GTCTCTTGGG 
TGAGACAGAC 



AGAAACCCAO 
CCTTTCTTTA 
ATACCC3VCTT 



\ GCATICACAT 



CCGCCAACCT 
CCAACGCX3CG 
TCAAGCTGGA 
CCAAGAGGAT 



TTTGCCARTG 
TOCGTAGCTC 
AAACGGGCCA 
CTGAAAGTGG 
GAOSGGAGTC 
TATGTCOTCT 
GACTACACCT 
GTGGTGACAO 



GAAATAAGGT 



ACTGTGTGGC 



CATTGCAATG 
TGACAGGAGG 
CCGGACCCTT 
CAGCAATGCA 
CGTTATCCAC 
TCACIGCACT 
CCAOATCOGC 
CTACATCCX3C 
GGTAGGCTCC 
CATCACGGGC 
CTGCAGOSCC 
GATCGACGCG 
GGTGAAATCA 
TGGTGATGAC 
CAAGGAGGAG 



TCCACCCAGA 
CAGAGCAGCT 
GAAAAGCCCC 
ACTGTGTTCC 
TCCACAOGAG 
GQTACCTTAG 
AGCAACCTGC 
CAAATCCTAG 
GAGTGTCTGG 
GTGTGGCAAA 
TCCATCAAGG 



CTCCTGCCCC 
CCCATAGCAC 
AGACCACX3GG 
GTTCTAT CTC 
CAAAGTTCTT 
AAATCCTCAC 
CCTGTGAGGC 



TSATACQGAA 
ACGGCCTGGA 
CCTCCCACTA 
CX»AAGGGAC 
CTGTGTCCCX: 
AGGCGTCCTT 



CTTTATAACA 
TGCAGGAGGA 
CAAGTCCCCA 
AACAG6AAAA 
TCCGAATACC 
GGTTCAAGTA 
CAGGATGGTG 



CCX3VGCCCCC 
CGTGGAGAGC 
CTCAGACAGA 



AACxrrcxjCGC 

GCGCGCAGQA 
ACCTCCCX33C 



TGGAGAACAT 
CGCCCCTGCC 
TCCTCGACGG 
CCAAGGACAG 
CGGTGCRGCT 



TTGTCCCCAA 
GGCACTCrCC 
AGGAACAGCG 
AACATCAACG 
CGGAAACTGA 



GGTTCCCTGG 
CGCAACQAGG 



GOC»COGATC 
CTACACATTA 
GCCGCTGGCC 
AAGCAGTATC 



TCAACAATGG 
GCTTTGCTGA 
CGCCCX3CCAC 
TCACTGTAGC 
CCAACAAGGT 
TTATTCAGAA 
CGGGAGAGGA 
GTAACCCCAA 
TTGACTGCAA 



TCCA06ACCC 
GCTCTGCCGC 
TGCAGA6TGO 
GCGGTCTCTC 
ACACGGAGAG 
ATAACCTGGT 



CTTCATGCAG 
GACACTCTAC 
AAATCAGGTC 



CTGTGAGGCC 
GATCCCCACC 
AGCCCAGCGT 



GTGACGGACA 
TACGTGGTGC 
AAOGACCACA 
CCCAATC!CCG 
TCQGATGAC* 
TTTAACGAAG 
GGGAAGGA06 
AAQACTTACT 
AAAGGAGAAC 
TCCTCTGAGA 
TCTGACAGCG 
QTGTGGATTC 
ACXX3TGCGGG 
ATCCCCACCC 
TATGGAAACC 



CTCGCTGCCC 
CAGCXnxSCGC 
GAACTTGTTT 
CGGGCGCTAT 
GAACGTGCAQ 
OSTCAGGTAC 
CATCCTCTGG 
AATCAAGGTG 
AGATTACCTG 
TGTGGTGATG 
OGGGGGTGAC 



CCTGGCCGCG 
TTGATAGCAG 
AAGATGCCGG 
TCAAAGTGGA 
AAGTCTTCTA 
AGATCTCCTG 
GCOGTGGACG CACCAAGCGC 



AGATGAGAGT 
TGGCGGTTCA 
CCATGCCCAA 



GCAACTACAC 
ACGTCAACGT 
AGATAGCAGC 
CGA6GGTGTT 
GGATCACTGT 



CAGAGTCAAG 
GQTQCQCTAT 
GGTGACTTGG 
ATACCAAGAT 
TGGTC 



5100 
5160 
5220 
5280 
5340
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 

6120 
6180 
6240 

6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6960 
7020 
7080 
7140 
CCAGCCACCC
CGGGGGCAGT 
ATGGGCTTTT 
CCATGGCAAC 




GGCCCCX»AA CCCCGGGACG 



ACAGCAGCTG 
CTCGGTGGAC 
GCTGGTCTCC 
CAGCRTCATC 
ACGTTTCTCC 
CGTTTCTCTT 



GTCACCAGCA TCCCCGTGAT 
CCGGTCATCT ACACCCGGCC 
CCCAAAGCTG ACATCAOGTa 
GCTCGTCTGT ATGGAAACAG 
! ATGCCGGCl 



TGTGATOGCC 
CGGGAACACC 
GGAGTTACCXa 
ATTTCTTCAC 
CTACAAGTGC 



TCTTTCAGTT 
CAGAGTGACT 
TTTATATGAA 
TATATATTTT 
ATTAAAATTA 
ATATAATTTT 
CCTTCTCCAG 



ATACTGTACA 



TGCAAAT6CC 
AAAATAAGCC 
AAOCTGCTGC 
ATTTCCTCTG 
GATATATATA 
AAAA6AAAAA 
TTCCTTTCAA 
ATAAATTATT 
AAAAAATTTC 
GAACCCTCCA 
AAGCTGTGCT 
AATTGAATCr 
CATCTGGTCT 
TACACGACCT 
TTTGATAATA 



ACTCGACTGC 
ATAGACATGA 
AGTTTTTACA 
TCACTTCRAA 
TATATATTTT 
CATTTCTTCC 
ATCAGAGGAT 
GGTCTTTACA 
TCTCCAACCT 



AQAJ3TCTTCC 
AAGGTGGCTG 
GTTATTTCCA 
AAATAATATT 



GCTGGGGCCT 
CTGAAGGTGG 
AATG6TGAGA 
TG6ACGCTCC 
CTGGACAATG 
TGCAOGATGG 
TATCCTCOCC 
QTGAAACTGA 
6ATAAGT06C 
CCCCAGGGAT 
ATGGCAAAAA 
GTGGATTCCA 
GGTTGGGGAA 
TCaMGTTGAG 



CTTCATAAQC 
ACAACACCTC 
TGATAGACTT 
ACTCCAGCTT 
AATTCAGAGT 
TGGAACTCAC 
GAGACTAGAA 
AGACTTGGAT 
CCTTCAAATT 
CTGCGATATT 
GAGAGGAGA6 
CCGAAAAGCC 
CTTCTTCCCC 
TGACTGCTTT 
CTCCCAAAAA 



ACCACAAGGC 
ACCGCIGCGT 
GACTGAAGCC 



GGCCCGCAAT 
AGAAGCAAAC 
CCCCTGCACC 
GCATCTGGAQ 
GGTTOGTGAa 



ATCTGAAGGC A 
CACTGACCAT C 
ACATTCTCGG C 
GAATGATTGC 1 



GTCCATAGCSV 
ACTACCCCAT 
TGTTCCAGAT 



TACATACATA 

GGAGAAATAC 
ACATTACAGC 
CAGTCACCAC- 
AGATTTCCTT 



CAGAAACrCC TCIGCAGTAT 
AGCCATOAOT CRGTTTGTGC 
ACTGTATTTT TAAGGTCAAT 
AAAAA 



11 



31 



41 



51 



7380 
7440 
7500 
7560 
7620 
7680 



7980 
8040 
8100 
8160 
8220 
8280 




TGAAGAOGCA 
TGACAAGTCA 
GATTTAGAAC 
CAGCTACCAT 
ATGTTTTATA 

AGACATGGAA 
TGTTATATTA 
GTATGCAAAG 



8940 
9000 
9060 
9120 
9180 
9240 
9300 
9360 
9420 
9480 
9540 



I I I I I 

MPKRAHWGAL SWLILLMGH PRVALACPHP CAOfVPSEVH CTFHSIASVP AGIARHVBRI 
HLGFHSIQAL SETSFAGLTK LELLMIHGNE IPSIPDGALH DLSSLQVFKF SYHKLRVITG 
QTLQGLSHLM RLRIDHNKIE PIBFQAFNGL TSUtU>HI<EG NLUCQI^ST FSTFTFLOYF 
RLSTIRHLYL ABNHVRTLPA SMXiRNMPLIiE HI.YLQGtIP»T CDCEMRWFIS WDAXSKGILK 
VIASVYTQyS
IFWVIiPDGSI 
LVQSPSTQPA 
LPKGTLSIPK 
VRBDIVEDEG 




CKKDKAYEGG QI,CAMCPSPK 
DGGSQIiIIiBK FQLPCJHSISIt 
VAUJFECPMT RENYEKLHKI. 
VRAQILAEPE WVMQPSIDIQ 
PSGAVQROQT VLEGGPCQLS 
RIKSMEPSDS GLYQCZAQVR 
AliAIPEAHLS WILFNRRIIN 
HPTVGITVTK KGSGI.PSKRG 
EJVPLKTKDDA INGDKKAKXG 
ERWAOILAKV RGXHLPKGTB 
SSADVPLLGE EEHVLGTISS 
TBGDIiXOTAA PTLISEPYSP 
EPFLDAVSLA ESGPMQYFDP 
STIGEPGVPG QSHLQGI.TDN 
RSSESEGQBS KSITLPDSTL 
TMSTHPSRRH PNQRRRLRPN 
VPTAWVnNTV NTPKQLEMEK 
PBHKHRNIVT PSSETILLPR 
SDGKEIKDDV ATNVDKHKSD 
NPSRIAQPGR LQTDIFVrrS 
TTLSSIKVEV ASSQAETTTIi DQDHLBTTVA ILLSETRPQH 
MSLGQTTTTK PAI.PSPRISQ 
PSSORDAFNIi STKLELBKQV 
RYFVTSQSPR 
RTDQFNGYSK 
IPTSPAPVMR 
STQS3ISFIT 






QTISTKDTRQ ARGRSHVHIE 480 

LKAPMDDPDS KFSIIiSSGWL 540 

BKDTVTIQKN PGESVTI.PCH 600 

VQVSDSGYYR CVAVNQQGAD 660 

GSGMGDEENT SRRIJ<HPKDQ 720 



GRRVFESRRR 
PPAVSPPSAS 
STPLEEWDD 



SPTIjRTLDTV 
DLETKSQPDB 
IHIjVKSSIiST 
GIMSSMSPVK 



NAEPTSKGTP 
TVSLKTEGPY 
ZI.VTGBSITN 



DKMKSJTPAH LTPTPTIWVN 
aDTIililKKOK KEMSQTIK2GG 
KPAETTVGTL IiDKDTTTVTT 
PTTPAPSETP STQPTQAPDI 
HRYTPSTVSS 
KIYSSYPKVQ 



IKMAKKQINP 
PVQTVTSAEE 
LSEKTEEITS 
SSPEPTSSEY 
OSSTSQI.FED 
NMLEGDPTHS 
TPRQKVAPSS 
KIS^pVESSL 



HWTNKPEITT 
VFGNNNIPEA 
ERKVIPGSYN 



VWQTVSPVES 



NLAPKDSGRY 
SGDPHPRIIiH 
YWLKVDWM 



PKPFVTWTKV 
VliSVTVQQP 
RITLHENRTIi 
PGLSiaiKCT 
ECVAANLVQS 
RI.PSKRMIDA 
KPAKIEHKEE 



RIHSKSTFia. 
QSSSKFFAGO 
STGALMTPirf 
QILASHYQDV 



AKAAPLPSVR 
ARRTVQUIVQ 
I^SFDSRIKV 



PSIPSKFTOR 
PQLGVTRRPQ 
TNLOmPMVS 
QTVSVTAETD 
QORGQYMCTA 
QISHIFFDRR 
HVAALPPVIH 
VFPNGTLYIR 
GQTLKIiDCSA 
CVAHNKVGDD 
DGSLVNSFMQ 
WTAPATIRN 
GTUiIQKAQR 
RKLIDCKAEG 
RNEGGEARIiI 
GTDLQSGQQL 
^YHNLVSII 
ASVFDRGTYV 
PKADZTNELP 
KTTYIHVF 



Seq ID NO: 418 DNA sequence

Nucleic Acid Accession ft: E08 sequence 

Coding sequence: 1..5001 



EFLSSIiTVST 
HTPTAARMKB 
ATPVNNEGTQ 
ASHQbTRVPA 
QFTTPHLSST 



PVGFPGTPTW 
PFHQEBAGSS 
PASSSPSTIL 



DFGPPAPPI^ 
PPASKFWSLO 
RIQRFEVUqJ 
TVYLGDTIAM 
GVYKCVASNA 
HVLGDGTQIR 
RAAANARITG 
7ANQTLWKS 
LKVDCVATGL 



KPII.PTATVR 
TIPLPLBMSK 
PPPTNKT1.SP 



EKFQILTKSF 
QTIiVIRKVQV 
ECIAKGTPAP 
AGADSLAIRL 



KTYIAVQVPY 
SDSGNYTCLV 
IPTPRVLWAP 
VQI,TVLEPME 
QRFYBKADGM 
KOETLKLPCT 
CRMBTEYGPS 
DKSHLXA6VQ 



GDWTVACBA KGEPMPKVTH LaPTNKVIPT SSEKYQIYQD 
RNSAGEDRKT VWIHVNVQPP KIHGHPHPIT TVREIAAGGS 
PBGWLPAPY YGtlRITVHQN GSU>IRSLRK BDSVQLVCMA 
KPlFHDPrSE KITAMRGHTI SUICSAAGTP TPSLVWVLPN 
LHISGIiSSVO AGAYRCVARM AA6BIERLVS IiICVGI>KPEAN 
PPGAGQGRPS WTIiPHGHHIiB OPQTLQRVSL IiDNGTLTVRE 
VTSIPVIVIA YPPRITSBPT PVIYTRFGMT VKLHCMAMOI 
ARIiYCaiRFIiH FQGSLTIQHA TQRDAGFYKC NAXNILOSDS 



1200 
1260 
1320 



1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 






ACCTCTCAAO 
CAGTCTGTGC 
TCAAGACAGT 
CAGATCGCTA 
GCAGTCCGTA 
ACACCAGAAT 
AAACCTACAG 
GTCTGTCTGC 



I 

CAAAACTAAC 
AGGACGAATT 
TTGTGTCCTG 
ACACCGTGCG 
ACAGGCGTGT 
TTTCACAGQQ 
CTGCCCCTAC 
TTGTCGCTGC 



GGATGTACCT 
GGTGGATCCT 
CTATOGAGAG 
GCTGATTGAG 
TOAAAGAGAT 
CACAGCTCCr 
ATCTTGGGAT 



GTTCTGGAAA 
AAGGGGGAAT 
AACCTGATTC 
GGCAAATGGA 
GAAAACTTGA 
GOSCTACCAG 



CCTATCCTGO AOACACTACT TCTGCCCTGS TCGATOSTCr 
TTTTCAAAAT CCX3GGCCACA AACAGGAGAQ GCXnGGGACC 
TCSCTATGOC AACAAGAAT6 CAGCTGTACC CAGAAGOATT 
ATCOATATCC AAACCAAACA A6TTAATAAA GATCCACAAC 
CCATGTTTTC TTTTCTACTT CCTCACATTT ATGCTGGATA 
ATCTGCTATG AAGACCCANN TGTTTCrrCT TTGACAGGCA 
GCCAGTAAGG CGGATGTTCA GCAGAACAGG GAGGACAAT6 
CCTTCCTCAC CTTCTCCCAG JU3CTCCAGCT TCCTCCCAAC 
CCCCAAGGGA GAAATGCCAA GGACCTTCTT CTTGACTTGA 



CAGACACTGT 
GTACGTCAGT 
ACGTCTGQCC 
AGACTGAGGG 
TCCAACCATC 
ATTTGGAGCA 
GCAQCCTGGG 
TCACTCCAAA 
TCAGTTGTCT 
TGGAAGGGAG 
TTGGCGGCTT 



GATATTGAAG 60 

TRTOTCATCT 120 

AGTTGTTGC3V 180 

GGATTATAAG 240 

GTATGAATTT 300 



TCACTGGGGA GOAGGAGCTG GGTTCCOGGG 
AAGACCAGAA AOGGACCCTG AGGCCGCCAA 
GCAGGACTGC AGTGAGGGCX: CGGATGCCAG 
CTGGCTTTTC CCTGGCCACG CAGCCCOGCC 
CTQCCCACCA CGCGTCCACC CAGGGCACCT 
ATQACAACGA CTTGGTGGAC TCAOAOJAAG 
CTCCACCCCA AGGGCGCXTTT CGCCCAGCCC CGGCCAGCCC 



TCGACA6AAA 
TCAGACACCC 
GTTGCTCCCG 
GTAGATAAGC 
TCGGCCTCTC 



GGAAACCCGA 
ACCCCTCTGT 
AGAACAAAAT 
CAGAGGAGCr 
AGGACTCGCC 
GTAGACACGG 



AGCTCTCCAC 
GCGCCATCAC 
GTCTCTGAGG 



TTCCATTTGC 



COOCCCATTC 
CTCCACCCCA 
TTTCCAAGGG 
GGTCCACCAT 
GAGCGGAGGC 
AGGCGGAGGC 
TCAGACACAA 



CAQAAOCrCT 
AGGG6CGGCA 
TGGGGGATCA 
CGGGAAGGAT 
GTCCTCCTCC 
TTCTGATGGT 
CACGQCCCAG 
ACCCTTTGCT 
GCAGCCCTCC 



GTGCACCCCG 
GAGGAAGATT 
TCTCGGCTGC 
GGTGAGGACG 
GTCTCTTCTC 
GAAAGCCACQ 
ACGCTGCGGG 



CTCATCGTCC 
ATQAGCGCGC 
TGTCCCCOVG 
GCGCAAAOCC 
CCAGTGCCTC 
TGCCCACCCA 



TGCCAAATCA S40 
AAGTCCCTCA 600 
GAACGCTATC 660 
GCCTTCATTG 720 
AG CTTACCTG 780 
TCTTTTTGOA 840 
TTCCTTCATT 900 
ATCTGTTGCA 
AAAACCTGAG 
GCCTGCTTCT 
ATTGGCTAAT 
GGATCTTCAG 
CATGTCACCC 
CCACTCGGTQ 
AAGGGAAG6C 
CCCCTCGGCT 



CCGCCAGTCC 
AGCCTCGCCG 
AGCCCCACCC 
GCCACACCTG 
CAACTCCAAT 



AGCTCCCCAC J! 



1320 
1380 
1440 
ISOO 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 

GCCCACXTCCA GGGTTCCCTC TCACTCTGAT TCCCACCCTA AGCTTAGCTC AGGTATCC»T 2220 

GQAGAOQAGQ AGGATGAGRA GCCGCTTCCT GCCACCGTTG TCAATGACCA CGTGCCTTCC 2280 

TCCTCCRGGC AGCCCATCTC CCGGGGCTGG GAGGACTTAft GGAGAAGCCC GCaGAGAGGG 2340 

GCCAGCXTTGC ATaSQAAGSA ACCCATCCCA GAQAACCCCA AATCCACAGG GGCAGATACA 2400 

CATCCTCAGG GCAAGTACTC CTCCCTGGCC TCCAAGGCTC AGGATGTTCA ACAOAGCACA 2460 

GACGCGGACA CGGAGGGTCA TTCTCCCAAR. GCACAGCCAG GGTCCACAGA CCGCCACGCG 2520 

TCCCCTGCTC GTCCrCCCGC AGCACGGTCA CAGCAGCATC CCAGTGTTCC CAGAAGGATQ 2580 

ACACCCGGCC GGGCCCXAGA ACAGCAGCCC CCTCCTCCCG TCGCCACGTC CCAGCACCAC 2640 

CCGGGACCCC AGMCAGAGA CX3CGGGTCX3Q TCACCTTCCC AGCCCAGGCT CTCACTGACC 2700 

CAGGCGGGGC GGCCCCGCCX: CACGTCGCAG GGCCGCTCCC ACTCCTCCTC GGACCCTTAC 2760 

ACGGCGAGCT CCAGAGGGAT GCTCCCCACG GCCCTCCAGA ACCAGQACGA GGATGCCCAG 2820 

GGCAGCTACO AGQAC6ACAG CACAQAAQTC GAGGCCC3U3G ATGTGOGGaC CXXXKCGCAC 2880 

GCCGCGCGCG CCAAGGAGGC AGCTGCGTCC CTTCCCAAGC ACCAOCAGGT GGAGTCTCCC 2940 

ACAGGCGCAQ GGGCAGGTGG GGACCACAGG TCCCAGCQCG GACAXGOGOC CTCCCCOGCC 3000 

AGGCCCAGCC GACCCGGCGG CCCCCAGTCX: CGCGCCCGGG TCCCCKGCAG OGCAGCGCCO 3060 

GGGAAGTCGO AGCXTCCTTC CAAGCGGCCC CTGTCCTCCA AGTCCCACCA GTCOGTCICA 3120 

GCCGAGGAOG AGOAGGAGGA GGACGOGGGG TTTTTTAAAG GOOGOAAASA AGAOCTTCTS 3180 

TCTTCCTCTG TGCCAAAGTG GCCCTCTTCC TCXIACTCCCA GGGQOGGCAA AOACGCCGAT 3240 

GGGAGCCTOG CC3UU3GAAGA GAGGGAGCCT GCCATCGCX3C TTGCCCCTCG CGGAGGQAGC 3300 

CTGGCTCCTO TOAAGOGACC TCTCCCCCCA CCTCCMGGCa OCTCCCXCAa GGCCTCCKAC 3360 

mcccncac oaccocogcc tcgcagcgct Gcx»cccrrGA gcccogtcgc gggcaccc»c 3420 

CX:CTGaCCGC GGTACACCAC QCGCGCCCCV CCTGGCCACT TCTCCACCAC CCCGATGCTG 3480 

TCCTTGOSCC AGAGGATGAT GCATGCCAGA TTCCGTAACC CTCTCTCCCG ACA OCCTG CC 3S40 

AGACCCTCTT ACAGACAAGG TTATAATGGC AGACCAAATG TAGAAQGGAA AOTCCTTCCT 3600 

GGTAGTAATG GAAAACCQAA TGGACAGAGA ATTATCAATG GCCCTCAAGG AACAAAGTGG 3660 

GTTGTGGACC TTGATCGTGG GTTAGTATTG AATGCAGRAG GAAOOTACCT CCAAOATTCA 3720 

CATGGAAATC CTCTTCX3GAT TAAACTAGGA GGAGATGGTC GAACCATTGT AGATCTGGAA 3780 

GGGACCCCCO TaGTGAOTCC TGACGGCCTC CCACTCTTTG GGCAGGGGCSj ACATGGCACA 3840 

CCrCIGSCCA ATGCCCAAGA TAAGCCAATT TTGAGTCTTG GAGGAAAGCC GCTGGTGGGC 3900 

TTQGAG8TCA TCAAAAAAAC CACCCATCCC CCTACCACTA CCATGCAGCC CACCACTACT 3960 

ACQACGCCCC TGCCTACCAC TACAACCCCG AGGCCCaCCA CTGCCACCAC C»TGC3«3CXX: 4020 

ACCACTACTA CGACGCCCCT GCCTACCACT ACACCGAGGC CCACCACTOC CACCACCOGC 4080 

CGCACGACCA CCAGGCGTCC AACAACCACA GTC06AACCA CTAOSaSGAC AACCACCACC 4140 

ACCACCCCCA AACCCflCCAC TCCCATCCCC ACCTQTCCCC CTGGBACCTT OGAAOGGCAC 4200 

GACGATGATG GCAACCTGAT AATGAGCTCC AATGGGATCX: CAGAGTGCTA CGCTGAAGAA 4260 

GATGAGTTCT CAGGCTTOGA QACTGACRCT GCAGTACCTA CGGAAGAGGC CTACGTTATA 4320 

TATGATGAAG ATTATGAATT TGAGACCTCA AGGCCACCAA CCACCACTGA GCCTTCGACC 4380 

ACTGCTACCA CACCGAGGGT GATCCCAGAG GAAGGCGCCA TCAGTTCCTT TCCTGAAGAA 4440 

GAATTTCATC TGGCTGGAAa GAAACQATTT GTTGCTCCTT ACGTGACGTA CCTAAATAAA 4S0O 

QACX:CATCAG CCCCGTGCTC TCTGACXGAT GCACTGGATC ACTTCCAAGT GGACAGCCTG 4550 

GATGAAATCA TCCCCAATQA CCTGAAOAAG AGTGATCTGC CTCCCCAGCA TGCTCCCCX5C 4620 

aacatcaccg tggtggcogt ggaaqgttoc cac tcatt tg tcattgtgga ttggoacaaa 4680 

gccaccccag gagatttg6t cacaggttat riuutitaca otgcatocta tgaagatttc 4740 

atcaggaaca agttttocac tcaagcttca tcagtaactc acttgcccat tqaoaaccta 4800 

aagcccaaca cga6siatta ttttaaagtg caagcacaaa atcctcatgg ctaoggacct 4860 

atcagcx:ctt oggtctcatt tctcaoogaa tcagaxaatc ctctocttgt tgzgaggccc 4920 

ccaggosgtg agctatctgq atcccattog ctttcaaaca tgatccc3vgc tacacg gact 4980 

gqatgoacg gcaatatgtg aagogcacgt ggtatcgaaa gttarrggga gttgttcttt 5040 

gtaattcact gagstataaa atctacctca gtgacracct gaaagataca ttcxacagca 5100 

ttggagacag ctggqoaaqa gqtqaaaacc attgccaatt tgtggattca caccttgatg 5160 

gaasaacausg gcctcaotcc tatgtagaag ccctccciac tattcaaggc tactatcgcx: 5220 

AGTAT08ICA GGAGCCTGTC AGGITTGaGA ACATCGGCTT OGGAAGCCCC TACTACTATG 5280 

IGGGCTGGTA CGA6TGTGGG GTCrCCATCC CTGOKAAOTa 8TAATCACAG GACCSTCATG 5340 

CTGCAAGCTT GCCCTGCCCA GCCCCACCAA CTAA6TCGCR CTAGGGGCTG TGAGC3UVAGA 5400 

CAQCCAaCAT GCTCAGCCCC QCTGCCCTAG GTGCCAGGAA GGTCACftGAT GGACACTGGC 5460 

CATTCTGGTC ATCTCSM3TCT GGAACTCAGT CCXACTTCTT GGCCTGGACA ATGAACAGGA 5520 

TTCAGTTTTG CTGTTAACTT TGCTTCTCTA CTTTTTTTTG TTTGTTTGTA ATAGCACATC 5SB0 

CCAGAGACAT CAQAAACX31.G CAACTGATTC AGTGTGATTT CCCAGACTTT TTAGGCATGA 5640 

AATTCGGACA CTTCAGTATT TCCAGGAATA GCATATGCAC GCTGTTCTTG CTTCATGOAA 5700 

TaCTACATCC TTlX.' i G T TTT TCTCATTTTG GATTTCTCCA AAACTAACTG AATTTAAGCT 5760 

TCAGGTCCCT TTCTATQCAG TAQAAAOQAA TTATTAAAAA CACTACCAAA GAAAATAAAT 5820 

ATATCCTACT TGAARTTTAC TCTATGGACT TACCCACTGC TAGAATAAAT GTATCAAATC 5880 

TTATTTCTAA ATTCTCAATT TTGATATATA TATGTATATA TGCATATACA TATCCACACT 5940 

TGTCIGCAAG AATATTGATT AAAATTGCTA AATTTGTACT TGTTCACCAA AAAAAAAAAA 6000 
AAAAAAA 

Seq ID NO: 419 Protein sequence 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

| | | | | | 

KFOTKLTRTC AFADYRVILK TSQBDEtJ>VP DDISVHVMSS QSVLVSWVDP VLEKQKKWA 60 

SRQYTVRYRE KGELARWDYK QIANRHVLIE NLIPDTVYEF AVRISQGERD 6KMSTSVPQR 120 

TPESAPTTAP ENLNVHPVNG KPTWAASWD ALPETBOKVK VCLLDTGIiFS VSSFQFSAKS 180 

FQNTPFHTPR LSNHLEQSPS PILBTLLIjPW WMVCSLOIAI FSKSGPQTQE AWDLTPKPSL 240 

SLCQQECSCT QKDFSCLAYL IDIQTKQVNK DPQLEGSVPG PCFLPYFLTF MIiDIGGPSFI 300 

MCyBDPVSSL rOHSLKSVTVA SKADVOONTE DNGKPEKPBP SSPSPRAPAS SQHPSVPASP 360 

QGRKAXDLLIi DLKHKIIiANa GAPRKPQUIA KKAEEUJLQS TEITGEEELG SREDSPMSPS 420 

DTQOQKRTLR PPSBHQHSW AFGRTAVRAR MPAI.PRREGV DKPGPSIATQ PRPGAPPSAS 480 

ASPAHRASTQ GTSHRPSI>PA SLNDNDLVDS DEDEHAVGSL HPKGAFAQPR PALSPSRQSP 540 

SSVLRDRSSV HPGAKPASPA RRTPHSGAAE EDSSASAPPS RLSPPHGGSS RLUTQPHI.S 600 

SPLSKGGKDG EDAPATNSNA PSRSTMSSSV SSHI.SSRTQV SEGAEASDGE SHGDOSREDG 660 

QRQAEATAQT LHARPASGHF HLLRHKPFAA NGRSPSRFSI GRGPRLQPSS SFQSTVPSRA 720 

HPRVPSHSDS KPRLSSGIHO DEEDEKPLPA TWNDHVPSS SRQPISRGME DLSKSPOSOA 780 

SLHRitEPIPE NPKSTGADTH PQGKYSSIAS KAQDVQQSTD ADTEGHSPKA QPGSTDRBAS 840 

PARPFAARSQ OHPSVPRRMT PGRAPBQQPP PPVATS^HP GPQSROAGRS PSQPRI.SI.TQ 900 

AGRPRPTSQO RSHSSSDPYT ASSRGMLPXA LQNQDESAQG SYSDDSTEVB AQDVRAPAHA 960 

ARAKEAAASL PKHQOVCSPT 
KSEPPSKSPIi SSKSQQSVSA 



GHFSTTPMI-S 
INGPQGTKWV 
L.FGQGRMGTP 
PTTATTMQPT 
CPPGTLERHD 



WPRYTTRAPP 
SNGKPNGQRI 
TPWSPDGLP 
TPLPTTTTPR 
TPKPTTPIPT 
DEDyEFBTSR 
PSAPC9I.TDA 
TPCTLVTOIfli 
SPSVSFVTES 



IiDHFQVDSIiD 
VYSASYBDFI 
DNPLLWRPP 



GAOAOGOHRS QRGHAASPAR PSRPGQPQSR ARVPSRAAPO 



PSYRQGyNGR 
OIPUlIiaGG 
EVIKKTTHPP 
TTTRRPTTTV 
BFSGI^TDTA 
FDIAGRKRFV 
ITWAVB6CR 
YFKVQ 



PNVEGKVLPG 
DGHTIVDLBO 
TTTMQPTTTT 



GIPECYAEED 
GAISSFPEEB 
DLPPQHAPRN 



OBBUSGSaSL SMMIPATRTA HDQNM 



VPTEEAYVIY 
APYVTYLNKD 
SFVIVDWDKA 
AQHPHOyGPI 



1020 
1080 
1140 
1200 
1260 
1320 
1380 

1500 
1560 
1620 






Seq ID NO: 420 DNA sequence 
Nucleic Acid Accession #: NM_022743 
Coding sequence: 128..1237 



I 

GTGGATTTTA 
TCTCTGCIGC 
AAAGCTGATG 
AAAAGCTTGG 
TCOTCCAGAC 
TTCAGAATCA 



CTGCTACCTO 
CTGCTTTGAA 
TG6TGATGAG 
GGCACACTGG 
TGAACGGCTT 
CrOCATCAAC 
ATACAGGATT 
CAAACT6CAO 
TQATATTATG 



OAGATACCTC 
AACACCCCTA 
CGATGCTCTC 
CCAGACCACA 
TCCX3TTCGAC 
GAGAAGCTTT 
AAAGAGGGCC 
GATGCCTCTC 
AACTCTTTCA 
ATCTCTTTGC 
CTCTTACTGC 



AGOGGGAATG 
TTCrrGGC3«3 
ACTCRTTTTA 
TCAGGCAACT 
AGCTGCCACC 
CCATCTGTAA 
TCAATCACRG 



GTQAACCTCT 

GGrrrocAAA 

ATTTGGTXGA 



TGTGACTQTT 
CAAGTATGGA 
AAGTGGGAGC 
CXX»ATATCA 
CTCGGCCTGT 
TTTTTCCCAG 
CTACATCAAG 
AGAGTGACAC 
TGCGAOGCCA 
CTTTGTTGAA 
CTTATTGGAA 
CCACAAOAAT 



TGACCAGTGA 
TCCGTTGCCA 
AGGAAGTTCA 
AG6TTCTGGC 
ACATCTACCA 
TGGAGGAAGC 
GAAGCCATCC 
GCATGTTTCC 
ATGGCAGAGA 



TGTATTGGCT 
GQCCAAATAC 
CAAATGCCTT 
AGTTGTCTTC 
TGATCTGGAG 
CGTAAT6ACA 
TGCCTTTGAC 
TGdGGASATI^ 
CT6TQACCCC 
AOACATCCSAG 



GCTGAAGGTG 
CTTGTTCTAT 
OGTCAGAGGG 
CCAAGCAATO 
ACACAGCCTG 
ATCCTAAGGO 
TGCCTTATTa AGGTCACACA 



TTTCTGGGCA 
TOTAGTGCTA 
AAAAGCTGCA 
AAACTTATGG 
TCAAATATTA 
TTTCAACATT 
CTTTTTGAAG 
CAGiSAAGTTG 
AACTGTTCGA 
GTGGGAGAGG 
AAGCAGCTGA 
AAGGATGCTG 
AAAAAAATTO 
GOGATCATAA 
CTCGACTGOG 
GGTACTCCSOA 
GTTCAA6TGA 
AAC3AATCTGA 
ATTGAAGATT 



AACKCAGATA 

ATGGAGCACC 

AC3UUICTGAC . 

TCATGAGAGA 

CCTTTGCAAA 

GTGTTGGCCT 

TTGTGTTCAA 

AGcrcaccAT 

GGGACCAGTA 
ATATGCTAAC 
AAGAACTGAA 



CATTAOTTOT AGAQAAGCAC 



CTCTATGCTT 
AGGTAAATAA 
QATTAXAATA 



CCATGGATQC 960 
CCATGGAOCC 1020 
TGAAAGTTGG 1080 
OACTGGCTTT 1140 
TGATTCTACr 1200 
GAGGGAAATA 1260 
TGTTAGCTGT 1330 
1380 
1440 



AATTCAAAAC 



CNSFTICHAE MQBVGVOLYF SISIOdlHSCD 
LOMIiMTSEER RKQLRDQVCP ECDCFRCQTQ 
WKWEBVLAMC QAIISSNSER LPDINIYQIiK 
IPPPGSHPVR GVQVMKVGKt Qt<HQ<a4FP<SA 



DSVRIiLGRW FKLKDGAPSB 
QDASQLPPAF DLFBAFAXVI 
HLUsRAVRDI EV6EEI<TICY 



Seq ID NO: 422 DNA sequence 
Nucleic Acid Accession #: NM_003014.2 
Coding sequence: 238..648 






3 QCTGAOAGCT 



AAACTCTCCT 
GGCAGGAAGA 
TTCCTCTCCA 



GGCGCTGCTC 
GCTTTGCTGC 
CrCGGOQAAQ 



TCCTAGTGGC GCIGIGCCTG TGGCTGCACC 



QCOCCCIGCQ ACGCQGTGCS CAXCCCTATG TGCCGGCACSi 
ATGCOCAACC ACCT6(»CCA CACCACGCAQ S^^^^^ 
GAGGAGCTG6 
GCGCCCATTT 
CaACGCGCGC 
AGCCTGGCCT 
ATCGTCaCGG 



GCACCCTGGA 
GCGACGACTG 
GCX3A0GAGCT 



GTTCCTGCAC 
CGAGCCCCTC 
GCCTGTCTAT 
GGATGTTAAO 
TGACTGTAAA 
AACGTATCTC 



CGTTCAAQGA 
AAAAGATCCA 
AAGAAAACAG 
GCTCCCAAAC 
.AACCOGAAAA 



AGTCCTCATC 
GTCCACACAT 
TGATGCTTCT 
TACAGTGGGA 



CAGCCAGTCC 
GAGTGTGAGC 
QQCATTGCXrr 



CCTGCCCCaVT 
TGAAAATTGC 
AGAGAGGCTG 
CAGTCGTAGT 
CAAGAA6AAC 
TAACTASTTT 
GGQACAGCCT 



AATGAGGTCA 
CGAACTCAAG 
CAAGATGTTC 
TTAGTTGAAA 



i 

GTGCXXrrGTG 
CGACTGGAGT 
6GACAGCGAA 
CGCGAGAGGG 
TGG0GCT6G3 
TGCCCTGGAA 
TCCTGGCCAT 
TCTTCTTCTG 
AGCCXSTGCAA 
ACAACCACAG 
TGTGCATTTC 
TCACACCA6A 
CCGATCGGTG 
ACAfiCTATGT 
CAAOSGTGGT 
TCCCX3CTCAT 
TCATCATGTG 
AATGGAQAQA 
GGAGAACAGT 



I 



TTGGGGGAAG 
AGATGAGGGT 
CAGTGCCATG 
CXSTGOGOSGC 
CATCACGCGG 
OGAGCAGTAC 
TGCCATGTAC 
GTCGGTGTGC 
CTGGCCOQAA 



CAAGTGTAAA 
TATTCATQCC 
GGATGTAAAA 
TACAAATTCT 
TTACGAGTGG 
TCAQCTTAGT 



AAAOCCTCCT 
OA aOAgA ACA 
TTCCTTACAQ 
ATGIAAGGCC ATGTGOOXT TGCCCTAACR 



1020 
1080 
1140 
1200 
1260 
1320 
1380 




ACTCACTGCA GTGCICTTCA 
GTTTTTCTTT GTAAGCCATC 
GAGTTAAAGC TGGTGGAAAA 
CTAGAAGAiGT AGGOAAAATA 
AAATGCCATA TTTCAAACAA 
TATCTGTTGT TGC3UVTGTTA 
AAGGAACAGT AGTGGAATGA 
TTTTTGTGAT QAAAGGGGAT 
TGTGTTTTTT TACX3U^TGAC 
AATAATAAAG AAflAATAAAT 
GTTACCTGAT TTCCATGATC 
ACAGTOAGTT TGTCTGTACC 
ATTTTATACC CACAAGAGAG 
AATAATTTGA CAAGCTTAAA 
TTAAATATTT TCTTTGCCTA 
AAAGTTQAGT TCCACCTCTG 
AAAAAGAACT TATTTGCAGC 
ATTTATTTTA AAAAACAATT 
AGGCATTCftA TAAATGCACA 
ACTACACAGA GGTAATCACT 
GCACTTATAA AATGATTTGA 
CTCCX^rCCTT TGCTTGGCCC 
TCTCATTTCT AACAGCTGTO 
TATTGGATAC TTAOGTOGTr 



TAGACACATC 
ACAAGCCATA 
GGCTTATTGC 
ATGCTTOTTA 
AACACGTAAT 
QTBATGTTTT 
ATGTTAAAAO 
TTTTTQAAAA 
TTCMTTTCT 
AAAAAG6AGA 
ATGATGCTTC 
ATTAGGAGTT 
GTATGTCACT 
AATGGCCTTC 
AATACATGTG 
AAATGAGAAT 
ATTTTATCAA 
TTATTGGCCT 
ACGCCCAAAG 



TTGCAGCATT 
GTGGTAGGTT 
ATTGCATTCA 
CAATTCQACC 



TTTCTTAAGG 
TGCCCTTTGG 
GAGTAACCTG 
TAATATGTGC 
TATGTTTTAT 
GAAAATATAA 



ATTAQA6AAG TAQCATATG6 



CTATGCTTCA 
TACAGAAGGT 
TGTQCATACT 
ATTGTAAAAT 
TACCTTTTGA 
TGTTTTTAAG 
TGCAGAAGGA 
AAAATTATAA 



1440 
1500 
1560 
1620 
1680 
1740 
1800 
I860 






\ AAACAAAAAT 1920 



G6CAGAC91AT 
TTQTCAACAC 
AGGTACTAAT 
CATCTTACTT 
ATGTGAGTGC 
AGAGGAGTTA 
TACTTGACAG 
CAAATTTCAT 
TTTGCTAACA 
GAAATAAAAT 
TGGCATATTA 



ACAAATAAAA 
TTTATTGAGA 
TTATATTCCA 
TCTTCACTGA 



TAAGTTTTCC 
TAGTATGCAT 
CAATACTGAA 



AGAAACTTAA t 

GTCTGGATTC CTGTTTTTTG 19B0 

CCTCTTAAGC AGCACXAGAA 2040 

ATGCTCAAGT 2100 

CCAC CCTGAG 2160 

TTTTCTTCAT 2220 
2280 
2340 

AATTaiGOAC AATTGOAGGC 2400 

CAGTAAGCAT GTATTTTATA 2460 

CCTATCTAAT CCTACTCTCC 2520 

TTCTCCAGGT GTTTGCTTAT 2 580 
GTATACATGT GTTTCATAAC 
TGTCAAGAAA GCAGAAACCA 
TACTCAACAA ACTGTTGTGC 
TAAACATCTC ACOGGAATTC 



2640 
2700 
2760 



21 



31 



51 



| | | | | | 

MFLSILVALC LMUnJVLGVR GAPCEAVRIP KCRHMPWNIT RMPNHLHHST QENAILAIEQ 
YEELVI>VNCS AVLRFPFC3U11 YAPICTIiEPt. HDPIKPCKSV CQRARDOCEP UIKMmHSHP 
ESUICDELPV YDKGVCISPE AIVTDLPEDV KHIDITPDMM VQERPUWDC KRLSPORCKC 
XKVKPTLATY LSXHY8YVIH AKIKAVQRSG CNEVmWVDV KEIFKSSSPI PRTQVPLITH 
SSCOCPBZLP IIQavi.ZMC:yB HRSRMHLLBN CLVSiafllOQI. SKRSIQWEER tQEQSKTVQD 
iCKRTAGSXSR SNPPKPKGKP PAFKPASPKK NIKTHSAQRR TNPKRV 

Seq ID NO: 424 DNA sequence 
Nucleic Acid Accession #: BC010423 
Coding sequence: 248..1780 



3 GAAGCAGCrC TGSGGGAOCT CGQAGCTCCC G 



TCTGCAGCCS3 
TTCAACCATG 
GCTGCTACTG 
CGTGGTAACT 
CGGCGAGCAA 
ACTAGCXKHJA 
GGAGCAOCOQ 



GGGTGTGTAO 
AGGCAAGAAC 
GCTCCCAGGG 



P CCTGCCTTCT 



ACCTGGGACA 
GCTGCC6TCA 
ACTTGTGTGG 
GTGTCCTTCC 
GGCAGAGAAG 
TGGAC3VCGGC 



TCCTCAAGGG A 



GGCGCGGCTG CGGCTCCGAG 
ACTAGAAGAG 
CCCCAGCX3TG 
CTCCCGCTCT 
GCAGCCACTG 
CATCCTCCAC 
GTGGCACATT 
CTCATACAAC 
CACTTTGGGC 
CAATGACTTC 
CTCTOGGAAG 
ACTCrrGTTC TGCCTTCTGQ 
ATGACXXAGA 
CATTCaa.TC 
GGCCACCCTG 
GGCCGCAGTT 
TCTCCAGGCT 
ATGAACCATT 
TACATCAATQ 
CTCCTTCTQT 
ACA CCCCC AI TTCTTOCXaa 
AAOCCTTCTG 
CACTGTGTGT 
TQACTQTCCG 
AAGTGAACTG 
GTTTGGOGTG 
C3«3ACCX:CAG 



TGGGAGCCGA 
TTACAGGCCG 
GCX»GGACGC 
TGQCATGGGC 
AATACGGGCT 
GCAACCCCCT 
AOGAGTGCCG 
TGCTGGTGCC 
TGACCCTGGC 
CGGAGGTCAA 
CCTCAGAGTT 
TGTCXXATCC 
TTGCTGAGGC 
GAGCTATGCT 



GTGCXXXCCG 
AAAACTGCCC 
TCX3GGTGGAC 
TCATGTGAGC 



GGGTCAGTTC 
AAACGCTGGG 
CCTGAGGCCT 



GGTCAGCACC 

tcccx:tqccc 

AGCCTCX?rGC 



TGCTTCTACC 
GOiGGCGAAG 
CXXX3CTTA0G 
GTGCTCCTGC 



AGTGGAOACC 
CTTATTCAAG 
CAGTCTGCCT 
GGCTGCTGCT 
AGACCTCAGA 
GAQGGGACTC 



AGGGCCGCGT 540 

GC3UICX3CAGT 600 

GCAGCTTCCA 660 

TCMTTGAATC CTQGTCCRGC 720 

AC»GCrQAG6 GCAGCCCAGC 780 

AGQCACRACG TCCAflCCJGTT CCPTCAAGCA 840 



TGGCCTGCTC CAGGACC»AA OGATCACCCA 
CTCtGTGAGG GGCCTTGAAG ACCAAAATCT 
CAAGTGCCTG A6TGAAGGGC AGCCCCCTCC 
TCTGCCCAGT GGGGTACGAG TGGATGGGGA 
GCACAGCGGC ATCTAOCTCT GCCATGTCAG 
CACTGTGGAT 6TTCTTGACC C 



AATAXGAGQA 
ACAOGGACCC 
ATAGTCTCAA 
ACTCCA06CT 
CTGGGGGGaC CSAGGAOQAG 




GTGCATGTGT 
TGGAGGGGTG 
TGGTGTATGT 
TGTGTCATGT 
AGCAGTATTA 



TOACATGQGA 
AGATGCrCCC 
GGGCTCCACC 
GCCTGTOTGA 
ACTGTGTOCG 
GCCACXX3GAT 



GAAGATCAGG 
CTACGGGCXaV 
COCAGGCCTG 



TCTCCIAOCA C 



ATGATGCAGA 
TAGCIGGAGC 
ATGGGGGCAA 
AGCOCICTQC 
CATQCQCOOa 
TTTTTTCTTG 
AGAGmGAG 



CKTCCXACTS 
AATTGAerCT 
GTGTTGACTG 
TGQTGTGTAT 
TTGAGTGGTT 
GACCTCTQCC 
GGTTGGAGGA 
TGGAATCTGC 
GTGTGAA6CA 



TGATGAGTGA 
AAACACAGAC 
ATGAAGGCAT 
AGCCCAOGGG 
CCTCCCTTCC 
GCCT CCTTAA 
CTTTACCTCC 



TATGCTGTCA 
GC3GTXWGCAA 
TGAAAAAGCA 
GAGAGGTGGA 
CTCCGGTGTO 
GCCRGTCCCT 



T6TGGAGGG6 
TATCaUSAGTC 
CACTGTCAGG 
GGTATTTTCT 
GACTGTGGCT 
AGGGAACCTG 



OCTCTG O TGG CCTCTGGGCC TOCTGCATGT 



1860 
1920 
1980 
2040 
2100 



2340 
2400 
2460 



CTGTAAAAAA ACCAAAACCC AAAAAAAAAA AAAAAAAAAA 



Seq ID NO: 425 Protein sequence 
Protein Accession #: AAH10423 

1 11 21 31 41 51 

| | | | | | 

MPIiSLGAHW GPEANLUJ.I. LLASFTGRCP AGBI^SDW TWUSQCAKL PCTyRGDSGE 60 
QVGQVAWARV DA6EX»QEIiA LIiHSKYGUIV 8PAYBGRVEQ PPPPHNPLDG SVLLRNAVQA 120 

DBGBYEOIVS TPPAGSFQAH LRLRVLVPPL PSUfPGPALE EGQGLTLAAS CTAEGSPAPS 180 

VTHDTEVKGT TSSRSFKHSR SAAVTSEFHL VPSRSMKGQP LTCWSHPGL LQDQRITHIL 240 
HVSFLAEASV RGLEDQNLWH IGREGAMIiKC LSEGQPPPSY NUTRLDGPLP SOVRVDODTI. 300 
OFPPLTTEHS GIYVCHVSNB FSSRDSQVTV OVUIPQEDSG KQVDLVSASV VWGVIAALIi 360 
FCLLVWWI. MSRYKRRKAO QMTQBCyEBEL TLTREHSIRR LKSBHTDPRS QPEESVGUtA 420 
EGHPDSLKDN SSCSVMSBEP EOSSYSTLTT VREIBTQTEIj LSPGSGRAEE EEI3QDEGIXQ 480 
AHNHFVQSiG TLRAKPTGNG lYZNGRCaUiV 

Seq ID NO: 426 DNA sequence 

Nucleic Acid Accession #: NM_003474.2 

Coding sequence: 37..3036 

1 11 21 31 41 51 

| | | | | | 

CACTAACGCT CTTCCTAGTC CCOXSaCCfA CTCXSaAC3«3T TTGCTCATTT ATTGCAACGG 60 
TCAAG6CTGG CTTGTGCCAO AACGGCGCGC GCGCSAOGCA OGCACACACA OGGGGGQAAA 120 
CTTTTTTAAA AATGAAAGGC TAGAAGRGCT CAGOGGCGGC GCX3GGC08TG CGCGAGGGCT 180 
CCGGAGCTGA CTOGCCGAGG CAGGAAATCC CTCX33GTCGC OACGCCCGOC OCXMCTCGGC 240 
GCCCGCGTGG GATGGTGCAG CGCTCGCCGC CXSGGCCCQAG AGCTGCTGCA CTGAAQGCCG 300 
GCGACX.ATGG CAGOSCGCCC GCTGCCCGTG TCCCXXGCCC GCGCCCTCCT GCTCGCCCTG 360 
GCCGGTOCTC TGCTCXSCCSCC CTGCGAGGCC CGAGGGGTGA GCTTATGGAA CX5AAGGAAGA 420 
GCTGATGAAG TTGTCAQTGC CrCTGTTCGG AGTGGGGACC TCTGGATCKC AaTGAAOAGC 480 
TTCQACTCCA AGAATCATCC AGAAGTGCTG AATATTCGAC TACAACGGGA AAGCAAAGAA S40 
CTOATCATAA ATCTGGAAAG AAATGAAGGT CTCATTGCCA GCAGTTTCAC GGAAACCCAC 600 
TATCTGCAAG AOGGTACTGA TGTCTCCCTC GCTCGAAATT ACACGGTAAT TCTGGGTCAC 660 
IGTTACTACC ATOGACATOrr AOQGGGATAT TCTGATTCAG CAGTCAGTCT CAGCACXJTGT 720 
TCTGGTCTCA GGGGACTTAT TCTGTTTGAA AATQAAAGCT ATGTCTTAGA ACCAATGAAA 780 
AGTQCAACX:A ACAGATACAA ACTCTTCCCA GCGAAGAAGC TGAAAAGCQT CC3QGGGATCA 840 
TGTQQATCAC ATCACAACAC ACCAAACCTC GCTGCAAAGA ATGTGTTTCX: ACCACCCTCT 900 
CAGACATGGG CAAGAAGGCA TAAAAGAGAG ACCCTCAAGG CAACTAAGTA TGTGGAGCTG 960 

GTGATCCTGG CAGACAACCG AGAGTTTCAG AGGCAAGGAA AAGATCTG6A AAAAGTTAAG 1020 

CAGCGATTAA TAQftGATTGC TAATCACCTT GACAAGTTTT ACAGACCACT GAACATTCGG 1080 

ATCGTGTTGG TAGGCGTGGA AGTGTGGAAT GACATGGACA AATGCTCTGT AAGTCAGGAC 1140 

CCATTCACCA GCCTCCATGA ATTTCTGGAC TGGAGGAAGA TGAAGCTTCT ACCTCGCAAA 1200 

TCCCATGACA ATGOBCAGCT TGTCAGTGGG GTTTATTTCC AAGGGACCAC CATCX3GCATG 1260 

GCCCC3UVTCA TGAGCATGTG CACGGCAGAC CAGTCTGGGG GAATTGTCAT GQACCATTCA 1320 

GACAATCCCC TTGGTOCAGC GGTGACCCTG GCACATGAGC TGGGCCACAA TTTCGGGATG 1380 

AATCATGACA CACTGGACAG aOQCTOTAGC TGTCAAATQG CGGTTGAGAA AQGAGQCTGC 1440 

ATCATGAACG CTTCCACOGG GTACCCATTT CCCATGQTGT TCABCAGTTG CAQCAQGAAG IS 00 

GACTTGGAGA CCAGCCTGGA GAAA6GAATG GGGGTGTGCC TGTTTAACXrr GOCOGAAGTC 1560 

AGGGAGTCTT TCGGGGGCCA GAAGTGTGGG AACAGATTTQ TOQAAaAAGO AQAGGAGTGT 1620 

GACTGTGGGG AGCCAGAGGA ATGTAXGAAT OGCTGCTGCA ATGCCACCAC CTGTACCCTG 1680 

AAGCCGGACG CTGTGTGCGC ACATGGGCTG TGCTCTGAAG ACTGCCAGCT GAAGCCTGCA 1740 

GGAACAGCGT GCAGGGACTC CAGCAACTCC TGTGACCTCC CAGAGTTCTG CACAGGGGCC 1800 

AGCCCTCACT GCCCAGCCAA OSTGTACCTG CACQATGGGC ACTCATGTCA GGATGTGGAC 1860 

GGCTACTGCT ACAATGQCAT CTGCCAGACT CACGAGCAGC AGTGTGTCAC ACTCTGQGQA 1920 

CCAGGTGCTA AACCTGCCCC TGGGATCTGC TTTGAGAGAG TCAATTCTGC AGOTQATCCT 1980 

TATGGCAACT GTGGCAAAGT CTCGAAOAGT TCCTTTGCCR AAT60GA6AT GAGAGATGCT 2040 

AAATGTGQAA AAATCCAGTG TCAAGQAQGT GCXAGCOSGC CAQTCATTGG TACCAATGCC 2100 

GTTTCCATAG AAACAAACAT CCCCCTGCAG CAAGGAGGCC GGATTCTGTG CCX3GGGGACC 2160 

CACGTGTACT TGGGCGATQA CATGCCGGAC CCAGGGCTTQ TGCTTGCAGG CACAAAGTGT 2220 

GCAGATGGAA AAATCTGCCT GAATCGTCAA TGTCAAAATA TTAGTGTCTT T6GGGTTCAC 2280 

GAGTGTGCAA TGCAGT6CCA CGGCAGAGGG GTGTGCAACA ACAGGAAGAA CTGCCACTGC 2340 

GAGGCCCACT GGGCACCTCC CTTCTGTGAC AAGTTTGGCT TTGGAGGAAG CACAGACAGC 2400 

GGCCCCATCC GGCARGCAGA TAACCAAGGT TTAACCATAG GAATTCTGGT GACCATCCTG 2460 

TGTCTTCTTG CTQCGGQATT TQTG6TTTAT CTCAAAAGGA AGACCTTGAT AOGACIGCTG 2520 

TTTACAAATA AiSAAOACCAC CATTGAAAAA CTAAGGTGIO TGOGCCCTTC OCGGCCACCC 2580 

CGTGGCTTCC AACOCTOTCa GGCTCACCTC GGCCACXTTQ OAAAASaCCT OATGAGaAAQ 2640 

COGCCAGATT CCTACXTCACC GAAGGACAAT CCCAGGAGAT TGCTQCAGTG TCAGAATOTT 27O0 

GACATCAGCA GACCCCTCAA CX3GCCTGAAT GTCCCTCAGC CCCAGTOVAC TCAGCGAGTG 2760 

CTTCCTCCCC TCCACCGGGC CCCSVCX3TGCA CCTAGCGTCC CTGCCAGACC CCTGCCAGCC 2820 

AAGCCTGCAC TTAGGCAGGC CCAGGGGACC TGTAAGCCAA ACOXXXTTCA GAAGCXTCTG 2880 

CCTGCAGATC CTCTGGCCAG AACAACTCGQ CTCACTCATG CCTTGGCCAfl GACCCCAGGA 2940 

CAATGGGAGA CTGGGCTCOG CCTGGCACCC CTCAGACCTQ CTCCACAATA TCCACACCAA 3000 

OTGCCCAOAT CCACCX3VCAC CGCCTATATT AA6TOAQAAG CCOACACCTT TTTT CAACftG 3060 

TGAAQACAlQA AGTTZGCACT ATCTTTCAGC TCCAGTTGGA QTTTTTTOTA OCARCTTTTA 3120 

GQATTTTTTT TAAICnTAA AACATCATTA CIATAAGAAC TrTQAGCTAC TGC06TCAGT 3100 

aCTGTGCTGT GCTATGQTGC TCTOTCTACT TQCACAG6TA CTTOTAAATT ATTAATTTAT 3240 

GCAQAATQTT GATTACAGTG CAGTGCGCTG TAGTAGGCAT TTTTACEATC ACTGAGTrTT 3300 

CC3V.TQGCAGG AAGGCTTGTT GTGCTTTTAG TATTTTAGTG AACTTGAAAT ATCCTGCTTG 3360 

ATGGGATTCT GQACAGGATG TGTTTGCTTT CTGATCAAGQ CCrTATTGGA AAGCAGTCCC 3420 

CCAACTACCC CCAGCTGTGC TTATGGTACC AGATGCAGCT CRAGAGATCC CAAGTAGAAT 3480 

CTCAGTTGAT TTTCTGaATT CCCCATCTCA GGCCAGAGCC AAGGGGCTTC AGGTCCAGGC 3540 

TGTQTTTGGC TTTCAGGGAG GCCCTOTQCC CCTTGACAAC TOGCSUSOGAQ GCICCCAGGQ 3600 

ACACCTGGQA GAAATCTGQC TTCXGGCCAS GAAGCTTTGa TGAGAACCXQ QGTTOCSVGAC 3660 

KGGAATCTTA AGGTOTAGCC ACACCAGGAT AGA6ACIGGA ACACTAOACA AGCCAGAACT 3720 

TGACCCTGAG CT6ACCAGCC GTGAGCATGT TTGGAAGGGQ TCTGTAGTeT CACTCAAGGC 3780 

GGTGCTTGAT AGAAATGCCA AGCACTTCTT TTTCTCGCTG TOCrtTCTAG A6CACTGCCA 3840 




CCAGTAGGTT ATTTAGCTTG 
CTGCAAACCX3 CCACCTCCCT 
CAATGATCCr GTATTCAGAC 
TGAACXaVTTA ACCAGATCTA 
AATTAGGCAQ ACTCTTTATG 
TATAGTTCAT GTCTGCTATC 
CATCCTCTTT TTCCAACTTG 
ACCXATTTCT TAAACACTTG 
CAACTTGCTT ATCAACTTCX 
CTCTTCACTC TTCAAATGCC 
AAT6GCATGA GAAATACAAA 
CTQ6ACTGGT TTTCACATTA 
TTTATQAQAA AGCCTTCTTT 
TOTACX»AGA ATCTTGGTTT 
TCCCCACTGT ATCTAGGCAA 
AAACACACAC AAAAGQGAAC 
TTATTCTATA 6TTATTAAGT 
AGATACATAC AGAATTACTG 
TATATACTAT TAAAAAGGTT 
AOATGCCCAA ATCCTTAGAT 
ACCAAAAAAA AAAAAAAAAA 






GGAAAG6TGG 
ATACTGCTTG 
AGATGAGGAC 
GTCAATCAAG 
CTTGCAAAAA 
ATTATTCGTA 
GCTGCAGGAA 
CAACCTACCT 
TAAATATTAT 
TGACTAGGGA 



TGTTTCTGTA 
GAGCTGAGCA 
TTTCCATGGG 
TCTGTTTACT 
CTACAACCAA 
GATATTQGAC 



GAAGACAATT 
TGGGGTCAAC 
GCXITTCCAaA 
CATAGTATTC 
CCAGCTCTAA 
TCTTTAAAAT 
TAACTGATTA 



GTTGAGCATC 
GAGATGTOGC 
GCCATGTTTC 
TAAGGTAAAA 
GACAACAGTT 
AGTTTTCCrA 



ATGACTATGG 
TACATTCCAA 
GTAAAGCCAT 



AGAAACCTAC 
AATCACCACA 
ACCACAACTA 
GCAAGGTTCA 
rTGGAATGTGA 
AAAGAACCrr 
ATGCTTTTAA 
ACAGAATGTG 
TTGGGCAGCA 
ACAAGGTCTT 
TGCCATGATG 
ACATAATTCA 
TGCTTTGAAA 
GCATTTCACT 
ATAAACTAAA 
CTCCrrATAOC 
GCTG6AAAAT 
TTQTACTAAA 



TOTTCATGGG 
CTCIATCGGG 
CAGAGTCTGA 
ATAAG6AAAT 
TCCCCTTGAA 
TAAAGTOACT 
CCTCTGTCTT 
CTCTGAGTGT 
CAGAAAAATA 
TTCCCGGTGT 



ATGCATCTGT 
AATACTGCIG 
GGCAAACATA 



4200 
4260 
4320 
4380 



4620 
4680 
4740 
4800 
4860 
4920 



CTGGCATGTT t 



r CCAATTATAA GAGGATATGA 



41 



MAARPIiPVSP ARAIiLLAIiAG ALLAPCEARG VSIMNBGRAD EVVSASVRSG DIiWIFVKSFD 
SKNHPBVLNI RLQRESKELI INLERNEGLI ASSFTETHVL QDOTDVSLAR NYTVILGHCY 
YHQHVRGYSD SAVSLSTCSG LRGLIVPENB SYVliBPMKSA TNRYKUPAK KIiKSVRGSCG 
SHHNTPNIJ^A KNVFPPPSQT WARRHKRBTL KATKYVBLVI VADNREFCIRQ GKDLBKVKQR 
tlEIANHVDK FYRPtNIRIV bVGVBVWNDM DKCSVSQDPF TStHEFLDWR KMKLLPRKSH 
DNAQLVSGVY PQGTTIGMAP IMSHCTADQS GGIVMDHSDM PLOAAVTLAH BI/aniFGMIlH 
DTUSRGCSCQ MAVEKGGCIM NASTOYPPPM VFSSCSRXDL BTSIiEKGMGV CLFNIi?EVRE 



ACRDSSNSCX 
AKPAPGICFE 
ISTNIPUX3G 



LPEPCTGASP 
RVNSACDPYG HCGKVSKSSF 
GRILCRGTHV YLGDDMPDPG 
NNRKNCHCEA HWAPPFOSKF 
RKTl.IRIiI.FT NKKTTIEKLR 
RLLQCQNVDl SRPLNGLNVP 
PNPPQKPLPA DPIiASTXRLT 



LAAOFWYIiK 
DSYPPKDNPR 
ALRQAQGTCK 
RSTHTAYIK 



Seq ID NO: 428 DNA sequence 
Nucleic Acid Accession #: nm 
Coding sequence: 135..1043 



AKCEMROAKC 
LVLAGTKCAD 
GFGGSTDSGP 
CVRPSRPPRG 
QPQSTQRVLP 
HALARTPGQW 



CYNGICQTHE 
GKIQCQGGAS 
GKICI.NRQCQ 
IRQADHQGLT 
FQPCQAHLQH 
PLHRAPRAPS 
BTGLRUVPUt 



OQCVTI.WGPG 
RPVIGTNAVS 
HISVFGVHEC 
IGILVTILCL 
LGKGU4RKPP 
VPARPI.PAKP 
PAPQYPHQVP 



3 GAGGAGGGGA AGCGGCGAAG 



TGGCCACCTT 
ACAGGAGCTC 
GTTTGQTCAA 
GTGAGATTCO 
ATGCCCAGGG 



CCAGCAGAAA 
CGCTGGCGAT 
GGGCTTACAT 
CAAGTCATTC 
CTGCATAAGC 
CTACCT CAAQ 
GATCCKXTTC 
ACTTQCTOCr GACCTGTGGQ QAGGAGGTGA 



GGCCGCCTGT 
GTGGGGTGTG 
GGGATTTGCA 
ATCAAAGACG 



CATGACCXTTG 
CCCACCOQAG 
TACAGCG6AG 
ATGTTTOGAG 
GCACAACGCT 



AQCGGGSAATG 



GGAAATGGTG 
CCAGGAGAAC 
ACCCTACGTO 
CACCCACAGC 
CTTCTGCACC 



GCTTTGGTGT 
GGTCCCCAAG 
ATCCAGCACT 
AACAACTCTT 
GGAAAATTTG 
GCTCTGCGGC 
TCCCaGTTGC 
ACECGGGTGA 
GACCTCGTGA 
GTGC3VGGTTC 
TCGGCCATCC 
AAGCTCTCCA 



GGGCCCACCA CGGGGAAGCA 



TGGGGCTCAG 



TTATCTATGG 



TCGGGGGCCT 
AQTATTCTOA 
GTCCATTTTC 
ACGCABOATT 
TGAQATGQAQ ACCCCTGGGG 
ACGTACTCAA GGGAGCGCGC 
AGTCAGTGGQ TGTCGGCCGC 
GCAGOGCCCC CAGAQCTOGG 
CTGGTGCTGT 
ATAAATATCG 
TGGTGCCAAA 



OCCACCEAAA 
GAAGCAGC6A 
GGCCTGGCCA 



CGAQGCAOAG 960 



QAATGTAAAA 
AG GGGG TGCT 
CTCTTGGCGA 



ACTGCTTCAA 
TCTAAATAAA 
TAAAAAAAAA 
TTAAAAGCTA 
CACTTGGGGG 
TTTCCCTTAG 
CTGGCGATTC 
GAAAGAAGAG 



ATCTCGATTT 
TGGCTTTCAA 
AAAACCAGCC 
TCAAACAGCQ 
AAACCTTATA 
QATTTCGTTA 
CAGGAGACCC 
AATGAA6ACT 



ACX3CCACCAA 
CCCTGCAQAA 
GCX3TGTTTGA 
TGACTTTTCT 
CCTTGAAATG 
CGGCX»TCAG 
GOSCGGCTGC 
TGCTOCACGA 
AGGAQGCCAT 
CCATCTTGAG 
AGCCCCAGGT 

GOACATCACC 
GGTAGCAAQA 
GGACCTTCCG 
TGAAATQAAA 
ACATTCCAAA 
TGTGQACTTC 
CCGTGGGOTC 
CCGCGTTATC 
TCTGTTGTGG 
CCACACAGTG 
CTCOGCGGAA 

CTTAGAATGC AGGAGAAGGG TGGAGAGORQ GCRGGOaCOQ 
TGGGGCCTTG 03GTTCAGAG 
TRATTTCTGA GCCATTGTAC 
TATCAGTTTA TATTTTAACC 
TTATATCTAC ATATCTGTCA 
AAACCAGCTC AAAGGGGGTT 
•iVTTTlTITT AAGTTCTATT 
GCCTGACATG GACTCCTGCC 
TGOGOAGTAC ATTTGACAAA 
AAGCTOCSTC 
CKrCTGAGGQ GAKJGOAAAa 
AAATGCI6AC CTTTTACATA 



ATCGAGGTGT 
TCAGGGGTGC 
CTCQTACCTT 
GGGACGTGAA 
GGTGCTGGGC 



CQCCCATGCC 
GrGGGAAOAC 
CGAAATCTTT 
TTAGAGAGGG 
GTGTTCGCX3G 
CrGGTGAATT 
TGTCTTCTTT 



QAACAGTCTG 

cxrrccAOGCc 

GGGATGTCAC 
AACGGRCAGG 
CTGCACTTAC 
CCATCTGTGG 



CACTTTTTTT 
ACAAAGCAAC 
CATCCTTTGA 
ACATAGCCAT 
CCCAGAGGAA 
TCTCyvCCTTO 
AOCTGGAAAC 
ACTTAGXAAT 



TTTCTATGTG 
GAGTGGCCCC 
ATTTATCCAG 
TGGQTCATTA 
GGCTGATTTT 
AaiXCTGACT 
AATACACACC 



1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
ISOO 
1560 
1620 
1680 



1980 
2040 
2100 
2160 
2220 




AAATCAASGA GACTGCTQAA AATCTCTAAQ OGACAGGATT TTCXMATCC TAATTCGAAA 2280 
TTTAGC3ATA AGGAGAGGAG TCCAAGGGGA CAAATAAAGG CAGAGAOAQA GAGAGAGAGA 2340 
GGGAGAGGAA GAAAAGAGAG AGAGAAAAGA GCCTCGTCCC 






11 



21 



31 






III 

MCAEHLGQFM TLALVLATFD PARGTDATNP PEGPQDRSSQ QKGRLSLQNT AEIQHCLVNA 
GOV6C6VFEC FENHSCEIRG LHGICMTFLH NAGKFDAQGK SFIKDAIJCCK AHALSURFGC 
ISKKCPAIRE MVSQMJKECY LKHDI.CAAAQ ENTRVIVEMI HFKDI.LLHEP YVDLVNLI,LT 
CGEEVKEAIT HSVQVQCEQN WGSI1CSII.SP CTSAIQKPPT APPERQPQVD RTKLSRAKHG 
EAGHHI.PBPS SRETGRGAK6 ERGSKSKPNA HAR6RVG6I/3 AQGPSGSSEH EDBQSEYSDI 



Seq ID NO: 430 DNA sequence 

Nucleic Acid Accession #: NM_005940 

Coding sequence: 23..1489 



21 
1 

GGATGGCTCC 
TGCTGCTGCT 
ACCTCCATGC 
CACCTGCCCC 
GTGGOGTGCC 

AGTTGGTGCA 
TGACGCCACT 
CCAGGTACTG 
CCTTCTTCCC 
CTATCGGG6A 



AAGCCCAGCA GCCCXXX3GGC 



AGCCCTGCCC AGTA6CCCGG 
CAGCCTCAGG CCTCCCCX3CT 
CCOACAGAAQ AGQTTCGTGC 



CCTAAAGGTA TGGAGCGATG 
TGACATCATG ATOQACTTGG 
TGGGGGCATC CTGGCCCATG 
C6ACTAXGAT GAGACCTGGA 
AGCCCATGAA TTTQGCCACO 
GTCCGCCTTC TACACCTTTC 
TCAACACCTA TATGGCCAGC 
CCAGGCTGGG ATAGACACCA 
CTOTGAGGCC TCCTTTGAC33 
S TGGCQCCTCC 



CATTTGGTTC TTCCAAGGT6 



CCTCTACTGG 
GGGTCCTGAC 
ATGCCCTCAG 
ATCTTTGTGG 



AGCGACTGTC 



GGGTGCTGAC 
CTCTGGGCAC 
AACCACCATG 
TCAGACT6GG 



CCTGQCCCAC 
ATGAGATTGC 
CGQTCTCCAC 
GTQGQGGCCA 
CCAGCCCIGT 
CTCAGTACTG 



CTGTGAAGGT 
GTGCCGAC3CC 
CCCTGCCAGG 
CAGGCATGGG 
ACAACTGCCG 



CGAGAGGAGO QGGCCACAQC CCTGOCATGC 180 
TOCCACGCAO GAAGOCCCXX: 
CGACX:CATCT GATGGGCTGA 
GCGCTGGGAG AAGACGOACC 
GGAGCAGGTG GGGCAGACGA 
CACCTTTACT GAGGTGCACX5 
GCATGGGGAC GACCTGCXX3T 
CAAGACTCAC CGAGAAGGGQ 
TGACCAGGGC ACAGACCOXSC 



GAGTCTCA6C C 
TGTCACCTCC ? 
ACOGCTGGAG C 
CATCCGAGGC C 



CT6CCCGCAA 300 

tCACCTACRG 360 

TGGCAGAGGC 420 

AGGGCCGTGC 480 

TTGATGGGCC 540 

ATGTCCACTT fiOO 



GGACGCTGCC 
GGTGTACGAC 
GAGGTTCCCG 



GAAGGCTCTG 
TGCCAACACT 
CCACGAATAT 
ACTGAGCCCA 
GGAGGGCCAC 



CCCTQGGCCC 
CX3CCAGATGC 
TCTTCAAAGC 
CATTGGCCTC 
CCCAQGGCXA 



GGCTACCCAG 
TTCGAGGATG 
GGTGAAAAGC 
GTCCATGCTG 



GAAGGCTTCC CCOGTCTCGT 
TTCCTCTGAC CATGGCTTGG 
CAGGCTAQAG ACCCATGOCC 



GCAGGTCX3T6 



TCCTTCCRGG 
TGAGCAACTG 
ATCTGTCTGC 
GTTCACRGTC 
CAACATACCT 
ATCCTCCAAA 
TTTTTAAACT 



GGCTGGCACT 
GGCTGTAGGG 
CTTCTGGCTG 
AAATGGGGAG 
CAATCCTGTC 
GCCATTGTAA 
GAGGATTGTC 



GAAGCAAGGG 
CAGGGCCACT 
AC3UVTCCTGG 
GGGTATTCTT 
CCAOGCCQGA 
ATGTGTGTAC 
ATTAAACACA 



AAATCTGTTC 
CATGCAGGAG 
TCCTCCTGAA 
A6T6TSIATA 



AGGTCTTGOT 
TCCAGAATCC 
ACXXCAGGCC 



AACCTTCTTC T 



GTCaCCTGCC 
GGGCAGTCTT 
GTCX:CTCAGG 
GTTGTGAGGT 

cagccctggc 
aggtgcx:tgc 
aggccaaaaa 

CTGGAGGCTG 
CAGCACTGCT 



11 



I 

MAPAAWIJISA 
PAFATQEAPR 
LVQEQVRQTM 
FFPKTHREGD 
YPLSLSPDDC 
VSTIRGBLFF 
QVWVYDGEKP 
PVPRRATDWR 
AEPANTFL 



Seq ID NO: 432 DNA sequence 
Nucleic Acid Accession #: NM_024022 
Coding sequence: 202..1563 



AARAUiPPML LLIjLQPPPt>L 
PASSLRPPRC 
ABAIKVWSDV 
VHFDYDETWT 
RGVQHIiYGQP 
FKAGFVWRLR 
VLGPAPLTEI> 
GVPSBIDAAF 



R GGQLQPOyPA 
[> GIiVRFPVHAA 
F QDAOGYAYPL 



AKALPPOVHH 
ARNRQKRFVL 
GRADIMIDFA 
QVAAHBFGHV 
LGPQAGIDTN 
LASRHWQGLP 



41 

I 

UIABRRGPQP 
SGORWEKTDL 
RYWHGDDLPF 
U3LQHTTAAK 
&IAPLBPDAP 
SPVDAAFEDA 



11 



21 



51 



2040 
2100 
2160 
2220 



51 
I 

WHAAI.PSSPA 
TVRILRFPWQ 
DGPGGILAHA 
ALMSAPYTFR 
PDACEASFDA 
QGHIWFFQGA 



RGRLYWKFDP VEVKAI.BOFP RI.V6POFFGC 



| | | | | | 

ACCXSGGCACC GQACGGCTOG G6TACTTTCG TTCTXAATTA GGTCATGCCC GTGTGAGCCA 
G6AAAGGGCT UWl ' l ' fA TGG GAAGOCMSTA ACACTGTGGC CTACTATCTC TTCaSTGOTG 
CXZATCTACAT TTTTGQGACT CGGOAATTAT GAG6TAGAQG TG6ASGOGGA GCOGGATGTC 
AGA6GTCCTG AAATAOTCAC CATGQGQQAA AATGAXCCOC CTGCTCTTGA AGCCCCCTTC 
TCATTOCOAT OGCTTrriGG CCTTGATGAT TTGAAAATAA GTCCTGTTGC ACC»QATGCA 



CCTGTCACTQ 
ATTAGCACTG 
CTCATCCTTT 
CXKK3GAGGAC 
CRCAGCTGCT 



CTGCCATTOA 
GCCATTGGTC 
AAGTGTATCG 



OCIG6A6GGG C 



CCTGTGCGGG 
TGACTTGTAC 
TCCAGCCCCA 
GCTGGGCAAT 
CCAGCCTGTG 
6TCAGGATCQ 
CXJTCCCTTTQ 



TGCAt3«3CCT 
TTGCTCTOQC 
GGCTCTGTCA 
CTCCCCAAGT 
TCCCACTTGG 



TGCCTGCCCA 



WO 02/086443 

GATGCTGTTG CTGCRCAGAT 
ATTOGQATCA TTGCATTGAT 
TCAOOGAAQT ACAGATGTCG 
GGAGTCTCGG ATTGCAAAGA 
AATGCGGTGC TCCRGGTGTT 
AAGOGTCACT A06CAAATGT 
6ATAACCTCA GAOTGAGCTC 
CACCTCTTGC CAGATGACAA 
TGTGCXrPCTG GCCACGTQST, 
AGCTCACGCA TCGTGGGTGG 
CTTCAGTTCC AGGGCTACCA 
ACTOCTGCAC ACTGTGTTTA 
CTAGTTTCCC TGTTGGACAA 
AGCRAGTACA AGCCAAAGAG 
CTCAOGTTCA ATQAAATQAT 
6ATGGAAAA6 TGTGCT6GAC 
CCTQTCCTGA ACCACGOGGC 
GTGTAOGGTa GCATCATCTC 
GACTIGCTGCC AGGGGGACAO 
TTAGTGGGAG CxyiCCAGCTT 
ACCXSTGTCA CCTCCn-rCCT 
TGAAGAGGAA GGGGACAAGT AGCCACCTGA GTTCCTGAGG 
TCCCCTGGAC TCCOGTOTAQ GAACCTGCAC ACX3AGC3U3AC 
CXSGCACCAGT ASCAGGC COO AAAGAG GCAC CCTTOCATCT 
GCTQCTTTTT GTTTTTTGTT TTTTTGAGGT GGAGTCTCGC 
GCAGTGGCGA AATCCCTGCT CACTGCRQCC TCCGCTTCCC 
CCTCAGCTTC 
TATTrTTAGT 
CAAATGAT6T 
CCTAGCCTCA 
GCGGCCTTTC 
ACATAAGCAG 
CCAGCCCAGA 
AACCCAOCCT 
ACTCGTTTAA 
TCCTTTGATT 
AAAAA 



AGTTTTTTCC 



AGCTGATAGC 
GTGTCCGGGT 
CCATGTGCTC 
TCCX»AGCTA 



AATCATCX3TC 
CTTCGACTGC 
TCGATGTGAC 
GGGTGGTCAG 
CGATGACTGG 
TGTGAGTTCA 



PCT/US02/12476 



CAGTATATGT 
GTGGTCATAG 
AGTGGCCCTG 
TCACGCCCCT 
CATGGACCAT 
TGGAGAAOAT 
TTATQAAlGCr 
ACTCTGAAGA 



mar otccatcgat no 



GTGGATCATC 
CCAGGTGGQT 
TGTCTACCAC 



ATTTCCAACA 
CTCTGCGCGG 
CTGaTOTGTC 



GCCTGCTTCav 
CGCTCCTTTC 
CCACTGGTCC 
TTATGTGACC 



TTCTACTTCC 
GGCCTATTTT 
CCAAATAATA 



TTTCACCATG 
GCCTCCCACA 
TGATC TTCAC 
ATCTGGTTTT 
TCACOTGCAA 
TGCAGTCACT 
AAGACTTATT 



GCTACCTGAC 
AAGAGAGGAG 
TGAACAAGCC 
TGGAGAGAGA 
TGATGAAGAC 
ACCCTTGGAG 
OATTCCAGCA 
TCTSTTGCCC 
TGGTTCAAGC 
CCAC3VCCCAA 
TGCTCTCSUkA 
TACAGGCATG 
AGAAGCAGCA 
GTCTTGCAAA 



QCTGTGGAAG 
TGGGGTGTAC 
CCTAAAAACC 
AGCCCGATCC 
CTCTGAGTTC 
CAACCTTCAA 



GTGCTGGGAT 
TAAGAACAAA 
CTCTCCAGGG 
AGCCACCAAC 

GCACGTTTTC ATCTCTAGGG 
TTCACATGTG 6GGAGGTTAA TCTAGGAATG 



CTAATTTTTG 
CCCCTGACCT 
GGCCACCACG 
ACTTQCAAOa 
ATTCCTGACG 
AAAAGACGCA 
ACCAQAACCA 



TGTTTCCTTC CCTCAAAAAA J 



1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 



2100
2160 
2220 
2280 
2340 
2400 
2460 



21 



31 



41 



51 



1 1 I 1 I I 

MQENDPPAVE APFSFRSLFG LDDLKISPVA PDADAVAAOI LSLLPLKFPP IIVIGIIALI 
LAIAIGUSIH FOCSGKKSCR SSFKCIEtilA RCDGVSrOtt) GEDETfRCVHV GGONAVLCVF 
TAASWKTMCS DDHKGHYAIJV ACAQLGFPSY VSSDNLRVSS LBGQFREEPV SIDHI.LPDDK 
VTALHHSVYV REGCASGHW TLQCTACGHR RGYSSRIVGG NMSIiLSQWPW QASLOFOGXH 
LCX3GSVITPI. WIITAAHCVY DLYLPKSWTI QVGLVSliLDN PAPSHLVEKI VYHSKVKPKR 
LGNDIALMKL AQPLTPNEMI QPVCLPNSBB NFPDGKVCWT 6GWGATEDGG DASPVLNHAA 
VPLISNKICN HRDVYGGIIS PSMLCAGYLT GGVDSCQGDS GOPLVOQEHS LWKLVOATSP 
GIGCAEVNKP GVYTRVTSFIi DWIHE(91B!iO LKT 

Seq ID NO: 434 DNA sequence

Nucleic Acid Accession #: NM_000493.2

Coding sequence: 97..2139



I 

CACXTTCTGC 
CCAGGAACTC 
CTGCTAGTAT 
ACAGGCATAA 
AAGAGTAAAG 



11 
1 

ACTGCTCATC 
CCAGCACGCA 
CCTTGAACTT 
AAGGCCCACr 
6TATAGCAGT 



GTTGQACCAQ 
CCXSGCTGGAA 
CCCAGGGGCT 
GGGGAAATGG 
GGTCCCACAG 



31 
I 

AAGCTTCAGA 
GAQAATATGC 
GTOTTTTACQ 
AAGACACAGT 
CAAGGTACTC 



r 6GACAACAGO 



TGCCACAAAT 
CTGAAOGATA 
TCTTCATTCC 



ACCCTTTTTG 
CCAAATGCCC 
CTACACCATA 
AGGCCCTGCT 
CGGAAGTCCT 



CCAGGTCCCC 



GATATGGTGC 
GACCATCTGG 
GCATCAAAGG 
AA6GCCCTCC 



TGATAGAGGT 
TGGGGAAOGA 
AGGGATTCCA 



OACCACCIGO 
GACCrAOMSG 
CTGGTATGRA 
GGGGTCTTCC 
GAGGTGAAAA 
AAATOGGACC 
GCATTGGAAA 
GTCTOCCTGG 



TGGACAGAAA 
AGGCXSrrCAG 
TGGGGTTCCA 
AATTGGCCCA 
GCC3M3GAGCT 
GGCTCCAGGA 960 



ATAOCIGGOC CCCCAGQGCC- TCCTGGCTTT QGOAAACCAO GCTTGCCAGG CCTGAAGGGA 



GGTCTTCCTQ GGAAOCCAGa TCTGACTGGA CCCCCTGGGA ATATGGGACC CCAAOGACCA 
AAAGGCATCC CGGGTAGCCA TGGTCTCCCA GGCCCTAAAG GTGAGACAGO GCCAGCTGGG 
CCK3CAGGAT ACCCTGGGGC TAAGGGTGAA AGGGGTTCCC CTGGGTCAGA TGGAAAACCA 
GGGTACCCAG GAAAACCAGG TCTCGATGGT CCTAAGGGTA ACCCAGGGTT ACCAGGTCCA 
AAAGOTOATC CTGGAGTTGG AGGACCTCCT GGTCTCCCAG GCCXrTGTGGG CCCAGCAGGA 
GCAAAGGGAA TQCCOQQACA CAAXGGAGAG GCIGGCOCAA GAGGTGCCXIC TGGAAXACCA 
GGTACTAOAG GCCCTATTGG 6CCACCAGGC ATTCCAGQAT TCCCTGGGTC TAAAGGGQAT 
CCAGOAAOTC CXSOOTCCICC TO6CCCAOCT GGCATAGCAA CTAAOGGOCT CftAlXSQACCC 
ACCGOGCCAC CSUiGGCCTCX: AGOTCCAAGA GGCCACTCTG GaGASC CTOO TCTTOCAGGQ 



GGCXaVAAGGC OaQTCTTTC TGOdACCCCT CTTQTTAQTG CCAACCAG6Q GGTAACAGGA 



1020 
1080 
1140 

1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740
ATGCCTGTOT CTGCTTTTAC TGTTATTCTC TOSUUWSCTT ACCCAGCAAT AGGAACTCCC 1800

ATACCATTTG ATAAAATTTT GTATAACAGG CAACAGCATT ATGACCX»AG GACTGGAATC 1860 

TTTACTTCTC ASATACCAGQ AATATACTAT TTTTCATACC AC6TGCATGT GAAAGGQACT 1920 

CATGTTTCGG TAGGCCTGTA TAA6AATGGC ACCCCTGTAA TGTRCACCTA TGATGAATAC 1980 

ACCaAAGGCT ACXTGGATCA GGCTTCAGGG AGTGCCATCA TCGATCTCAC AGAAAATGAC 2040 

CAGGTGtGGC TCCAGCTTCC OUITGOOQAG TCAAATGQCC TATACTOCTC TGAGTATGTC 2100 

CaCTCCTCTT TCTCAGGATT CCTAGTGGCT CCAATaTQAa TACACCCCAC AOAGCTAATC 2160 

TAAATCTTGT OCTAOAAAAA «3CaVTTCTCTA ACTCTACtXX: ACCXTTACAAA ATGCATA3GQ 2220 

AGGTAGGCTO AAAAGAATOT AATTTTTATT TTCTQAAATA CAGATTTGAG CTATCAGACC 2280 

AACAAACCTT CCCCCTGAAA AGTGAGCAGC AACX3TAAAAA CGTATGTGAA GCCTCTCTTG 2340 

AATTTCTAOT TAGCAATCTT AAGGCTCTTT AAGOTTTTCT CCAATATTAA AAAATATCAC 2400 

CAAAGAAGTC CTGCTATGTT AAAAACAAAC AACAAAAAAC AAAGCAACAA AAAAAAAAAT 2460

TAAAAAAAAA AAC3«3AAATA GAGCTCTAAG TTATGTGAAA TTTGATTTGA QAAACTOGGC 2520 

ATTTCCTTTT TAAAAAAGCC TGTTTCTAAC TATGAATATG AGAACTTCTA GGAAACATCC 2580 

AGQAGGTATC ATATAACTTT GTAGAACTTA AATACTTGAA TATTCRAATT TAAAA6ACAC 2640 

TGTATCCCXaC AAAATATTTC TCATGQIGCA CIACTCTGA6 GCXTTOTATGa CCCCTTTCRT 2700 

CAATATCTAT TCAAATATAC AGGTGCaTAT ATACOTeTTA AAGCTCTTAT ATAAAAAAGC 2760 

CCCAAAATAT TGAAGTTCAT CTGAAATGCA AQGTGCTTTC ATCAA3GAAC CTTTTCAAAA 2820 

CTTTTCTATG ATTGCaGAGA AGCTTTTTAT ATACCCAGCA TAAC TTGG AA ACAGGTATCT 2880 

GACCTATTCT TATTTAGTTA ACACAAGTGT GATTAATTTG ATTTCTTTAA TTCCTTATTG 2940 

AATCTTATGT GATATGATTT TCTGGATTTA CAGAACATTA GCAC»TGTAC CTTGTGCCTC 3000 

OaTTCRAGT GAAGTTATAA TTTACACTGA GGGTTTCAAA ATTCJGACTAG AAGTGGAGAT 3060 

ATATTATTTA TTPATGCACT GTACTGTATT TTTATATTGC TGTTTAAAAC TTTTAAGCTG 3120 

TQCCTCACTT ATTAAAGCAC AAAATGTTTT ACCTACTCCT TATTTACGAC ACAATAAAAT 3180 

AACATCAATA QATTTTTAGG CTGAATTAAT TTGAAAGCAQ CAATTTGCTQ TTCTCAACX» 3240 

TTcrrrc&AS gcttttcatt cgacacaata aaataacatc aatag 



I I I I- I I 

MLPQIPFUiIa VSUn.VKGVF YAERYQMPTG IKGPI»PirrKT QFFIPYTIK3 KOIAVRGEQG 
TPGPFGPA6P RGHEGPSQPP GKPGYGSPSI* QGBPGIiPGPP GPSAVGKFGV PGIiPGKPGER 
GPyGPKQDVG PAGIiPGPRGP PGPPOIPGPA GISVPGKPGQ QGPTGAPGPR GFEGEKGAPG 
VPGMNGQKGB MGYGAPGRPG ERGLPGPQGP TGPSGPPGVQ XHGENGVPGQ PGIKGDRGPP 
GEMGPIGPPG PQGPPGERGP BGIGKPGAAG APGQPGIPGT KGLPGAPGIA GPPGPPGHGK 
PGLPGUCGER GPAGLPGGPG AKGEQGPAGL PGKPGbTGPP GNMQPQGPKG IPGSBGI.BGP 
K0ETGPAGPA GYPGAKGERG SPGSDGKPGY PGKPGIiDGPK GNPGIiPGPKG DPGVGGPPGL 
PGPVGPAGAK GMPGHNGEAQ PRGAPGIPGT RQPIGPPGIP GFPGSKGDPG SPGPPGPAGI 
ATKGIiHGPTG PPGPPGPRGH SGEPGLPGPP GPPGPPGQAV MPEGPIKAGQ RPSI«JTPLV 
SAMQGVTGMP VSAFTVILSK AYPAIGTPIP FDKILTORQQ HYDPRTGIPT CQIPGIYYFS 
YHVHVKGTHV WVGI.VXNGTP VMVTXDEYTK GYLDQASGSA IIDLTENDQV MLQIiPNAESN 



Seq ID NO: 436 DNA sequence
Nucleic Acid Accession #: XM_062811
Coding sequence: 1..888

1 11 21 31 41 51

ATGTGGGGCG CTCGCCBCTC OTCOGTCTCC TCATCCTGGA ACX3CCGCTTC GCTCCTGCAG 60 

CTGCTGCTGG CTGCGCTGCT GGCGGCGQGG GCGAGGGCCA GCX3GCGAQTA CTGCCACGGC 120 

TGGCTQGACG CGCAGGGCGT CTGGCGCATC GGCTTCXAGT GTCCCGAGCG CTTCGACGGC 180 

GGCGACX3CCA CCATCTGCTG CSK3CAGCTGC GCGrTGOGCT ACTGCTGCTC CAGCX3CCGAG 240 

GCX3CGCCrrGG ACCAGGGOBO CIGCXSACAAT GAOC6CCAGC AGGGOGCTOG C GAGCCTG GC 300 

CGGGCGGACA AAGACGGCCC CGAOS GCTCG GCAGTGCCCA TCTAOGTGCC GTTCCTCATT 360 

GTTGGCTCCG TGTTTGTCGC CTTTATCATC TTGGQC5TCCC TGGTGGCAGC CTGTTGCTGC 420 

AGATGTCTCC GGCCTAAGCA GGATCCCCAG CAGAGCCQAG CCCCAGGGGa TAACaSCTTG 480 

ATGGAGACCA TCCCCATQAT CCCCAGTGCC AGCACCTCCC GGGGGTOGTC CTCACGCCAG 540 

TCCAGCACAG CTGCCAGTTC CSVGCTCCAGC GCCAACTCAG GGGCCKGGGC GCCXXfCAACA 600 

AGGTCACAGA CCAACTOTTG CTTGCCGGAA GQGACCATQA ACAACGTGTA TGTCAACATG 660 

CCCAOGAATT TCTCTGTGCT GAACTGTCAG CAGGCCACCC AGATTGTGCX: ACATCAAGGG 720 

CAGTATCTGC ATCCXKCATA CX3TGGGGTAC AOGGTGCAGC ACGACTCTGT GCCCATQACA 780 

GCTQTGGCAC CTTTCATGOA OGGCCTGCAG CCTGGCTACA GGCAGATTCA OTCCCCCTTC 840 
CCICACACCA ACASTQAACA OAAOATGTAC CX»GCGGIGA CTGIATAA 



1 11 21 31 41 51 

I I I I I I 

HNGARRSSVS SSWNAASLLQ UiIAAUiAAO ARASGEYCKG HUMOGWIRI GPQCPERFDG 
OlATICOGSC AUIYCCSSAB ARLDQGGCDN ORQQOW^Pa HADKDGPDGS AVPIYVPFLI 
V6SVFVAFII IXSSLVAACCC RCLRPK(»FQ QSKAPGGHRL HBTimiPSA STSRGSSSRQ 
8STAASSSSS ANSQASAPPT RSQI3ICX3:.PB GTMNNWVNM PTHFSVUfCQ QATQIVPHQG 
QYIiHPPYVGy TVOHDSVPMT AVPPFHDOIiQ POIfRQIQSPP PHTHSEOMK PAVTV 

Seq ID NO: 438 DNA sequence

Nucleic Acid Accession #: NM_004004.1

Coding sequence: 1..681

1 11 21 31 41 51 

I I I I I i 

ATGGATTGGG GCACGCTGCA GACGATCCTG GGGGGTGTGA ACAAACACTC CACCAGCATT 60
GGAAAGATCT GGCTCACCGT CCTCTTCATT TTTCGCATTA TGATCCTCGT TGTGGCTGCA 120
AAGGAGGTGT GGGGAGATGA GCAGGCCGAC TTTGTCTGCA ACACCCTGCA GCCAGGCTGC 180



AAGAACGTGT GCTACGATCA CTACTTCCCC ATCTCCCACA TCCGGCTATG GGCCCTGCAG 240

CTGATCTTCG TGTCCAGCCC AGCGCTCCTA GTGGCCATGC ACGTGGCCTA CCGGAGACAT 300

GAGAAGAAGA GGAAGTTCAT CAAGGGGGAG ATAAAGAGTG AATTTAAGGA CATCGAGGAG 360

ATCAAAACCC AGAAGGTCCG CATCGAAGGC TCCCTGTGGT GGACCTACAC AAGCAGCATC 420

TTCTTCCGGG TCATCTTCGA AGCCGCCTTC ATGTACGTCT TCTATGTCAT GTACGACGGC 480

TTCTCCATGC AGCGGCTGGT GAAGTGCAAC GCCTGGCCTT GTCCCAACAC TGTGGACTGC 540

TTTGTGTCCC GGCCCACGGA GAAGACTGTC TTCACACTGT TCATGATTGC AGTGTCTGGA 600

ATTTGCATCC TGCTGAATGT CACTGAATTG TGTTATTTGC TAATTAGATA TTGTTCTGGG 660
AAGTCAAAAA AGCCAGTTTA A

Seq ID NO: 439 Protein sequence
Protein Accession #: NP_003995.1

1 11 21 31 41 51

| | | | | |

MDWGTLQTIL GGVNKHSTSI GKIWLTVLFI FRIMILVVAA KEVWGDEQAD FVCNTLQPGC 60
KNVCYDHYFP ISHIRLWALQ LIFVSSPALL VAMHVAYRRH EKKRKFIKGE IKSEFKDIEE 120
IKTQKVRIEG SLWWTYTSSI FFRVIFEAAF MYVFYVMYDG FSMQRLVKCN AWPCPNTVDC 180
FVSRPTEKTV FTVFMIAVSG ICILLNVTEL CYLLIRYCSG KSKKPV



Seq ID NO: 440 DNA sequence

Nucleic Acid Accession #: XM_061091.1

Coding sequence: 1..2481

1 11 21 31 41 51 

ATGCCSkAATA CTTCATCAAC AACCAGGATT GAAATTTGGC TTCTCCAAOA GCCGCCCGGG 60

CACCXSftGCGC TGGTCGCCGC TCTCCTTCOG GTGAGTCCCA GCCCCGAGTT GGCTCTGGCG 120 

CCCGQQTACX: CGCC31QTGCC GGCTGCCOAT GACCGATTCA CGCTCCCGAT GATTGGAGGT 180 

CAGATGCATG GTGRGAAGGT AGATCTCTGG AGCCTTGGTG TTCTTTQCTA TGAATTTTTA 240 

GTTGGGAAGC CTCCTTTTGA GGC3«ACGAA GTCCATGTAA GCAAAGAAAC CATOGGGAAG 300 

ATTTCAGCTG CCAGCAAAAT GATGTGGTGC TCGGCTGCAG TGGACATCAT GTTTCTGTTA 360 

GATGGGTCTA ACAGCX3TCGG GAAAGGGAGC TTTGAAAGGT CCAAGCACTT TGCCATCACA 420 

GTCTGTGACO GTCTGGACAT CAGCCKCGAQ AGGGTCAGAG TGGGAGCA*r CCAOTTCAGT 480 

TCCACTCCTC ATCTGGAATT CKCCTTGGAT TCATTTTCAA CCCAA CAGQ A AGTGAAGGCA 540 

AGAATCAAQA GGATGGTTTT CAAAGGAGGG CGCAOSGAGA OGGAACTTGC TCTGAAATAC 600 

CTTCTGCACA GAGGSTTGCC TGGAGGCAGA AATGCTTCTG TGCCCCAQAT CCTCATCATC 660

GTCACTGATG OGAABTCCCA QQGGGATGTG GCACTGCCAT CCAAGCAOCT GAAGGAAAGG 720

GGTGTCACTG TGTTTGCTGT GGGGGTCAGG TTTCCCAGGT GGGAGGAGCT GCATGCACTG 780 

GCCAGCX3AGC CIAGAGGGCA GCAOGTGCTG TTGGCTGAGC AGOTOGAGGA TGCCACCAAC 840 

GGCCTCTTCA GCACCCTCAG CAGCTCGGCC ATCTGCTCCA GCGCX3«X3CC AGCTGQGAGC 900 

CCCGAGCTTG TCTTCATGGA GCGGTTAATG GGCATCTCTC TOATAGOCCC CTaTGACTCG 960 

CAGCCCTGCC AGAATGGAGG CACATGTGTT CCAOAAGGAC TOOACQGCXA OCRGTGCCTC 1020 

TCCCCGCTGG CCTTTGGAGG GGAGGCTAAC TGTGCCCTGA AGCTGAGCCT GGAATGCaGQ 1080 

GTCGACCTCC TCTTCCTGCT GGACAGCTCT aCX3GGCACX» CTCTGGACGG CTTCCTCCGG 1140 

GCCAAAGTCT TCGTGAAGCG GTTTGTGCGG GCCXSTGCTGR GCGAGQACTC TCGGGCCCX3A 1200 

GTGGGTGTGG CCaCATACAG CAGGGAGCTG CTGGTGGCGG TGCCTGTGGG GGAGTACCAG 1260 

GATGTGCCTG ACCTOGTOX; GAGCCTCGAT GGCATTCCCT TCCGTGGTGG CCCCACCCIG 1320 

ACXSGGCAGTG CCTTGCGGCA GGCGGCAGAQ CGTGGCTTOG GGAGCS3CCAC CaGGACAGGC 1380 

CAGGACCGGC CACQTAGAGT GGTGGTTTTG CTCaCTGAGr CACACTCCGA GGATGAGGTT 1440 

GCGGGCCCAG CGCGTCACGC AAGGQOGOOA OAGCTOCTCX: TGCTGGBTGT AGGCa«3TGAG 1500 

GCCGTGOSGG CAGAGCTGGA GGAGATCACA GGCAOCCCAA AGCaTGTGAT GQTCTACWffl 1560 

GATCCTCAGG ATCTQTTCAA CCAAATCCCT GAGCTGCAGG GGAAGCTGTG CAGCXGGCAG 1620 

CGGCCAGGGT GCCGGACACA AGCCCTGGAC CTCGTCTTCA TGTTGGACAC CTCTGCCTCA 1680 

QTAGGGCCCG AGAATTTTGC TCAGATGCAG AGCTTTGTGA GAAGCTGTGC CCTCCAGTTT 1740 

GAGGTGAACC CTGACGTGAC ACAGGTCX3GC CTGGTGGTGT ATGGCRGCCA GGTGCAGACT 1800 

GCCTTCGGGC TGGACACCAA ACCCACCCGG GCTGCGATGC TCCXSGGCCAT TAGCCAGGCC 1860 

CCCTACCTAG QTGGGGTGGG CTCaUSCCGGC ACCGCCCTGC TGCACATCTA TGhCMAGIG 1920 

ATGACCGTCC AGAGGGGTGC CCGGCXTOJT OTCCCCSUVAa CTOTGGTGOr GCTCACAGGC 1980 

GGGAGAGGCG CBGAiBGATQC AGCCQTTCCT OCCCSUSAAGC TQAGGAAOkA TQGCATCTCT 2040 

GTCTTG6TCB 1000001000 GCCTOTCCTA AGTGAGGGTC TGCjeGAGGCT TOCAGGTCCC 2100 

CGGGATTCCC TGATOCACGT GGCAGCTTAC GCOGACCTGC GGTACCAOCA OGACGTGCTC 2160 

ATTCAGTGGC TGTGTGGAGA AGCCAAGCAG CCAGTCAACC TCTGCAAACC CAGCCCGTGC 2220 

ATGAATGAGG GCAGCTGCX3T CCIGCAGAAT GGGAGCTACC GCTGCAAGTO TCGGGATGGC 2280 

TGGGAGGGCC CCCACTGCGA GAACXX3TGAG TGQAQCTCTT GCTCTGTATG TGrOAGCXJAG 2340 

GOATGGATTC TT6AGACGCC CCTGAGGCaVC ATGGCTCCCG TGCAGGAGGG CAGCAGCCGT 2400 

ACCXOTCCCA GCAACTACAG AGAAGGCCTG GGCACTGAAA TGGTGCCTAC CTTCTGGAAT 2460 
QTCTGTGCCC CAGGTCCTTA G 



1 11 21 31 41 51 

I i t 1 I 1 

MPHTSGTTRZ SltfLLQEPPO HRALVAAUiP VSPSPBLAIiA PGlfPPVPAAD DRFTLPMIGG 
QHHSSgmiM SUamXXEFU VOKPPPBANE VHVSKETIGK XSAASXMMKC SAAVDIMFLL 
DGSNSVGXQS FERSKHFAIT VCDOLDISPB RVRTOAEQPS STPHIiBFPLD SFSTQQEVKA 
RIKRKVFK66 RTBTEUkUClf T,TW""-^^^ HASVPQIHI VTDGKSQGDV ALPSKQLKER 
GVTVFAVGVR FPRMBBI£AIi ASBPRGQHVL LAEQVEDATN GLFSTLSSSA ICSSATPAGS 
PELVPHBRLK GISLIOPCDS QPOQMGGTCV PBGUMYQCL CPLAFGGEAN CAUO-SUJCR 
VDUjFIiUJSS AGTTIiDGFIJl AKVFVKRFVR AVMEDSRAR VGVATVSREL I.VAVPVGEyQ 
DVPDLVHSLD GIPFRGGPTL TGSALRQAAE RQPGSATRTG QDRPRRVWL LTESHSEDEV 
AGPARKARAR BIJ.I.I.aVGSE AVRAEI^IT GSPKHVMVYS DPQDLFNQIP ELQGKLCSRQ 
RPGCRTQAIiD IiVFMLDTSAS VGPENFAQMQ SFVRSCALQF EVNPDVTQVG LWifGSQVQT 
AFOLDTKPTR AAMLHAISQA PYLGGVGSAG TAUJilTOKV MTVQRGARPG VPKAWVI.TG 
GRGAEDRAVP AQKMUJNGIS VLWGVGPVI. SBGLRHLAGP RDSLIHVAAY ADLRYHQim. 







lEWLCGEAKQ PVNLCKPSPC MNEGSCVLQN GSYRCKCRDG « 
GWILETPLRH MAPVQEGSSR TPPSNYREGL GTEMVPTFWI \ 

Seq ID NO: 442 DNA sequence
Nucleic Acid Accession #: Eos sequence
. .2424 






ATGCCCCCTT 
TCTCTCCCTC 
AGCRAAATGA 
AGCGTCC3GGA 
CTGGACATCA 






TTTGCTGTGQ 
AQAGGGCAGC 
ACCCTCAGCA 
CCXTTGTGAGC 
AGAGGATOQC 
AGAGTGTTCC 
TCGCAGCCCT 
CTCTGCCCGC 
AGGGTCGACC 
CGGGCCAAAG 
CGAGTGGGTG 
CAGGATGTGC 
CTGACGQGCA 



TCAGTAGGOC 
TTTGAGOTGA 
ACTGCCTTCG 
GCCCCCTACC 
GTGATGACCG 



AAGGGAGCTT 
GCCCCGAGAG 
CCTTGGATTC 
AAGGAGGGCG 
GAGGCAGAAA 
GGGATQTGGC 
GGGTCAGGTT 
ACGTGCTGTT 
GCTCGGCCAT 
ACAGGACGCT 



I 

• GGAGGCCGTC 
■ CCATGTAAGC 
: GGCTGCAGTG 
TGAAAGGTCC 
GGTCAGAGTG 
ATTTTCAACC 
CACGGAQACQ 
TGCTTCTGTG 
AOTGCCRTOC 



41 



I 



S HS8CSVCVSQ 780 



51 



I 



TGTTTTCCAG AGTGCCCCCA 
AAAGAAACCA TCGGGAAGAT TTCAGCTGCC 
GAC3VTCATGT TTCTGTTAGA TGGGTCTAAC 
AAGCACTTTG CCATCACAGT CTGTGACGGT 
GGAGCATTCX: AGTTCAOTTC CACTCCTCAT 
CAACAGGAAG TGAAGGCAAG AATCAAOAGG 
GAACTTQCTC TGAAATACCT TCTGCACAGA 



TAACCCACCC 
GCCA6AATGG 
TGGCCTTTGG 
TCCTCTTCCT 
TCTTCGTGAA 
TGGCCACATA 



GGCTGAGCAG 
CIGCTCCAGC 
GGAGATGGTC 
TGOGGTGCTG 



AAGCAGCTOA AGGAAAGGGQ TCTCACTQTG 
GAOQAGCTQC ATGCSICTGGC CAOCGAOCCT 
GTGGAGGATG CCAOCAACOG CCTCTTCAGC 
0CCAC6CCA6 ACTGCftCGGT CGAGGCTCAC 
CGGGASTTCG CTGGCAATGC CXXaVTGCTGG 
OTOCCTTCTA CAGCTGGAAG 



GTGCCTTGCG 



CCGAGAATTT 
ACCCTQACGT 
GGCTGGACAC 



GCTGGACAiGC 
GCGOTTTOTG 
C3VGCAGGGAG 
CTGGAGCCTC 
GCAGGCGGCA 
AGTGGTGGTT 



CAACC3MTC 
ACAAGCCCTG 
TGCTCAGATG 
GACACAGGTC 
CAAACCCACC 
GGGCTCAGCX: 



CTGCTGGTGG 
GATGGCATTC 
GAGCGTGGCT 
TTGCTCACTG 



TGAAOCTQAG 
CCACTCTGGA 
TGAGCGAGGA 
OGGTGCCTGT 
CCTTCOGTGG 
TCGGGAGOSC 
AGTCACACTC 
TCCTGCT6G6 
CAAAOCATST 



CCTGGAATGC 
CGGCTTCCTG 
CTCTCGGGCC 
GGGGGAGTAC 
TGGCCCCACC 
CACCAGGACA 
CQAGGATGAG 
TQTAGGCAGT 
GATGGTCTAC 




ggcctggtgg 

CX3GGCTGCGA 
GGCACCGCCC 
GGTGTCCCCA 
CCTGCCCAGA 
CTAAGTGAQ6 



TGAGAAGCTG TGCCCTCCAG 
TGTATGGCAG CCAGGTX3CAG 
CATTAGCCAQ 



TGCTGCACAT CTATGACAAA 
AAGCTQTGGT GGTGCTCACA 
AGCTGAG6AA CAATQGCATC 
GTCTGCGG»6 GCTTGCAGGT 
TGCGGTACCA CCAGGACGTG 



CGTCCTGCAB AATGGGAGCT ACCGCTSCAA QT6T0GGGAT 
CX»GAACCGT QAaTGOAQCT CTTGCTCTQT ATGTGTGAGC 
GCCCCTGAGG CACATGGCTC CCGTGCAGGA GGGCAGCAOC 
CAGAGAAGOC CKSGGCACTO AAATQGTGCC TACCTTCTGC 
TTAG 



MPPFLLLBAV CVFIiFSRVPP 



FAVOVRFPRW 



ELAI.KYI.IAR 
EBIiHALASEP 
REFAGNAPCM 
VPEGLDGYQC 
RAVLSEDSRA 
ERGFGSATRT 



SVGPGMFAQM 
APYLGGVGSA 
SVLWGVGPV 
CMNEGSCVIiQ 
RTPPSNYREG 



QSFVRSCALQ 
GTAIiLBIYDK 
LSEGLRRIiAG 
NGSYHCKCRD 
LGTEMVPTFW 



21 
1 

SIiPLQEVHVS 
LDISPERVRV 
GLPGGRNRSV 
RGQKVUJ^ 
RGSRRTIiAVIj 
LCPIiAFGGSA 
RVGVATYSRE 
GQDRPRRVW 
SOPQDI.FHQI 
FEVNPDVTQV 
VMTVQRGARP 
PSOSLIHVAA 
GWEX3FHCENR 
NVCAPGP 



KETIGKISAA 
GAFQFSETPH 
PQILirVTDG 



SKMMWCSAAV 
IiEFPLDSFST 
KSQGDVALPS 



AARCPF2SWK 
NCALKLSIiEC 
LLVAVPTCEY 



PELQGKLCSR 



GVPKAVWLT 
YADIiRYHQDV 
ENSSCSVCVS 



RVFLTHPATC 

QDVPOLVWSIi 
VAGPARHASA 
QRPGCRTQAIi 
TAFGLDTKPT 
GGRGAEDAAV 
I.IEWLG(S»K 
QGNII^FUR 



Seq ID NO: 444 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 89..2356



X 11 
I 1 
GCCCCCTGGC CCGAGCCOCG 
GTCGCCGCTC TCCTTCOGTT 
TGTTTTCCTG TTTTCCAGAG 
AOAAACCMC GGQAAQATTT 
CStrCATGTTT CTGTTAQATG 
GCACTTTGCC ATCACAGTCT 
AOCATTCCAO TTCAflTTCCA 



CCCGGGTCTG 
ATATCAACAT 
TGCCCCXaTC 
CAOCTQCCA6 
GGTCTAACAG 
GTGACQGTCT 
CTCCTCATCT 



1020
1080
1140 

1260 
1320 
1380 
1440 
1500
1560
1620 
1680 



1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 



I 

- DIMFIiLDGSN 
QQEVKARIKR 
KQLKERGVTV 
ATPDCRVEAH 
YRTTCPGPCD 
SAGTTIiDGFti 
DGIPFRGGPT 
RBLIjLIiGVGB 
DLVFMU3TSA 
RAAMLRAISQ 
PAQKIiRiniGI 



TGAGTAGAGC CGCCCGGGCA COGAGCGCTG 
GCCCCCTTTC CTGTrGCTGG AAGCCGTCTQ 
TCTCCCTCTC CAGGAAGTCC ATGTAAGCAA 
CAAAATGATG TGGTGCTCGG CTGCAGTGGA 



GGACATCAGC CCCGASASOG TCAGAOTGGG 
QGAATTCCCC TTGGATTCAT TTTCAACCCA 
GGTTTTCAAA GGUUJGOCGCA OGGAGACaQA 
GTTGCCIGGA GOCAGAAATO CTTCIGTGCC 
GTCCCAGGGG OATCTQ6C%C TGCCA'TCCAA 



GCAGCTGAAO GAAAGGGGTG
GGAGCTGCAT GCACTGGCCA 
GGAGGATGCC ACCAACGGCC 
CACXSCCAGAC TOCAGG3TCG 
GQAGTTCGCT GGCAATGCCC 
TGCACACTCT CCCTTCTACA 
CAGGACCACC TGCCCAGGCC 
TCXaVGAAGGA CTGGACGGCT 
CTGTGCCCTQ AAGCTGAGCC 
TGCGGGCACC ACTCTGGACG 
GGCCGTGCFO AGCGAGGACT 
OCtOQTGGCQ oraccTSTGa 
TGGCATTCCC TTCCGTGGTQ 
GCGTGGCTTC GGGAGCGCCA 
GCTCACTGAG TCACACTCCG 
AGAGCTGCTC CTGCTGGGTG 
AGGCAGCCCA AAQCATGTGA 
TGA6CTGCAG GGOAAGCTGT 
CCTCGTCTTC ATGTTGGACA 
GASCTTTGTG AGAAGCTGTG 
CCTGGTGGTG TATGGCAGCC 
GGCTGCQATG CTGCGGGCCA 
CRCCGCCCTG CTGCACATCT 
TGTCCCCAAA GCTGTGGTGG 
TGCCCAGAAG CTQAGGAACA 
AAGTGAGGGT CTGCGGAGGC 
CGCCGACCTG CGGTACCACC 
GCCAGTCAAC CTCTQCAAAC 
TGGQAGCTAC CGCTGCAAGT 
CTTGAGACGC CCCTGAGGCA 
AGCAACTACA OAQAAGGCCT 
CCAGGTCCTT AGAATGTCTG 
GAGGATGTCC CAACTGCAGC 
AAACGATGTT GTTGAAAAGT 
GCXrTTGTTGA GGCTATGTCA 
OACTTAAATT TAGCGaCCTQ 






TCACTGTGTT 



GCTGGAAGAG 
CCTGTGACTC 
ACCAGTGCCT 



TGCTGTGGGG 
AGGGCAGCAC 
CCTCAGCAGC 
CTGTQAGCAC 
AGGATCGCGG 
AGTGTTCCTA 
GCAGCCCTGC 
CTGCCCGCTG 
GGTCGACCTC 
GGCCAAAGTC 
AGTGGGTGTG 
GGATGTGCCT 



AGGACGCTGG 
CGGACCCTTG 
ACCCACCXTTG 
CAGAATGGAG 
GCCTTTGGAG 
CTCTTCCTGC 
TTCGTGAAGC 
GCCACATACA 



AQATOGTCEG 
CGGTGCTGGC 
CCACCTGCTA 



CCAGGACAGG 
AGGATGAGGT 
TAGGCAGTGA 
TGGTCTACTC 
GCAGCCGGCA 
CCTCTGCCTC 



GGCCGTGCGG 
GQATCCTCAG 
GCGGCCAGGG 
AGTAGGGCCC 
TGAGGTGAAC 
TGCCTTCGGG 
CCCCTACCTA 
GATGACCGTC 



GCAGAGCTGG 
GATCTGTTCA 
TGCCGGACAC 
GAGAATTTTG 
CCTGACGTGA 
CTGGACACCA 
GGTGGGGTGG 
CAGAGOSGTG 



GGGAGGCTAA 
TGGACAGCTC 
GGTTTGTGCG 
GCAGGGAGCT 
OQAGCCTCGA 
AGGCGGCAGA 
TGGTGGTTTT 
CAAGGGCGCX3 
AGGAGATCAC 
ACXAAATCCC 
AAGCCCTGGA 
CTCAGATGCA 
CACAGGTcisG 
AACCCACCCG 
GCTCAGCCGG 



: TGTCTTGOTC GTGGGCGTGG GGCCISTCCT 



CATGGCTCCC 
GGGCACTGAA 
CTTCXXBCCG 
CATGCTGCTT 
TTTGATGTQT 



CATTGAGTGG CTGTGTGQAQ AAGCCAAGC3V 
CATQAAtGAQ GGCAGCTGCG TCCTGCaGAA 
CTGGGAGGGC CCCCACTGCG AGAACCQATT 
TACCCCTCCC 
TGTCTGTGCC 
ACIGAGGGAG 
TGTCACCCAC 
AAGTAAATAC CCACTTTCT G TACCTGCTOT 



CCCCACTGCG 
GCAGCAGCCG 
CCTTCTGGAA 
CACTATTCTC 



1 11 
I I 
MPPFIiI>IiEAV CVPIiPSRVFP 
SV^CGSFERS KHPAITVCDG 
MVPKGGRTEI ELALKYLLHR 
FAVGVHFPRW EELHAIASEP 
PCEHRTLEMV REFAGNAPCW 
SQPCQNGGTC VPEGLDGYQC 
RAKVFVKRFV KAVL8EOSSA 
LTGSALRQAA BRGFGSATRT 
BAVRAELBBI TGSPKBVMVY 
SV6PEHFAQK QSFVRSCAUQ 
APTLGGVGSA GTALUtlYDK 

sviiWavGPv IjSeclxriag 

CMNEGSCVLQ NGSYRCKCRD 



21 
I 

SLPI4EVHVS 
IJ3ISPERVRV 
GLPGGRNASV 
RGQHVUiAEQ 
RGSRRTUm. 
LCPLAFGGEA 
RVOVATYSRB 
GQDRPRRVW 
SDPQDIiFHQI 
FEVNPDVTQV 
VMTVQRGARP 
PRDSLIHVAA 
GWEGPHCENR 



IiEFPLDSFST 
KSQGDVAliPS 
TI.SSSAICSS 
RVFLTHPATC 



GLWYGSQVQ 
GVPKAVWI.T 
TADLRYHQDV 
FLRRP 



QDVPDLVWSL 
VAGPARHARA 
QRPGCRTQAL 
TAFGLDTKPT 



LIEHLOGBAK 



Seq ID NO: 446 DNA sequence
Nucleic Acid Accession #: NM_0319<
Coding sequence: 145..1260



41 



TGCTCCTCCT 
CCGATCTGGG 
GTAAAGAAGA 
TCCTCTGATG 
TCAGTTOGGG 
GCGAXGAAGT 
CAGCCCTCAG 



ATGTCTGAAT 
QACTCACAAT 
CCTGAACGGA 
GCTCTACCCA 



GTQACCCTTC 
GTCTGCAGCA 
TGCCGTCAGA 
CGAGGCCAGT 
CTGCTGGATC 



TTTGGGAATG 
TATCTC GAAA 
TTTTTTCACT 
AATCAAGTTA 



GCTGTGGGAC 
CACCOGCCAC 
ACTTAAA6AA 
ACAGTTQTGA 
AAGGCTQTAG 
TTCCAGCGCG 
AGAATTCTGT 
AGAAAAGGGC 
TAGAAAGCTT 
CAAGGAGACC 
GAGCTCQTCC 
TOGAGGAGGA 
GCTACATGAA 
CGCATATAAT 
ATTCTCGAGA 
AGACTATTGA 
TCTGTGGCCC 
CGAACTQGCA 
ATGGACGGTQ 



CAGCATGGAC 
ATTCAGATAT 
CAGCTTTGCT 
QACCCGCAGC 
GAGTACCAGG 
GACTGATTCC 
TTTAAATATA 
CCCTGGCTCQ 
GCGAAGGCOT 
TCTTACCAOQ 
GGAGGAAQAQ 
TGAAGATGAC 
TCGCCCAGTQ 
GAAGATATAT 



GCGCGCCCAG 
GCGGCTGCTC 
GCTCGCCGCG 
GTGAAOTTGA 
TCTGATAATT 
CAGTGCAGGC 
GGAGCAACCA 
AACTCCGATT 
AAGCAAAACA 
TTCCGTGGAA 
ACATTCC06Q 



I 



I 



ATTTGCTGCC 
6AAACCT6AG 
ATCTTA&CAG 



CTTGAAAM3C CTOAAACnGG 
TGCCrrCTAC TTCTCAAATC 
TTAAAA ATCT TGATGATCAG 
ACAT6TGTTT CTGGAGCATC 



AATTCGAAAT 
TTTCTTGTAA 
CCTGTTTCRT 
ACAGAAGGTA 



1320 
1380 
1440 
1500 
1560 
1620 



1800 
1860 
1920 
1980 
2040 
2100 
2160 



KETIGKISAA 8KMMHCSAAV OIMFIiUX3Slf 



QQEVKARIKR 120 

KQLKERGVTV 180 

ATPDCRVEAH 240 

YRTTCPGPCD 300 

SAGTTUXJPL 360 

DGIPPRGGPT 420 

RELLLLGVGS 480 

DLVFMLDTSA 540 

RAAMUiAISQ 600 

PAQKLRtlNGl 660 

QPVNLCKPSP 720 



CCTGCCAGCC GCGCTGCTGC 60 

CGCTCTCCCC GCTCCAAGCG 120 

TGCCGCAGAA AGATCTCAflA 180 

TTTCCATGGA AACCTCOTCA 240 

TTGCAAACAC GAGGCTGCAG 300 

ACTCTGGACC TCTCAGGGTG 360 

ACAAAAAAGC AGA6TCCCGC 420 

CAGAAGATGA AAGTGGAATG 480 

AAGCAATGCT TGCAAAACTC 540 

GACATCCCCT CCCAGGCTCC 600 
GTGTTGCTTC CAGGAGAAAC 660



GATAAOTACA TGTTGGTGAa AAAQAlOGAAG 780 
CTGCCCAGAA 
GAAGAAATTA 
AACCGTTCAC 
AACTGCAGAA 
AACC3GTTATG 



CAGAGGAGGA GTTGGAOAAC 900 



TTGTCATCAA 
CTGGGGCGTT 
CA GGGA TGCT 
CAGTTTCTGC 
ATATCATGQC 
GCAAGCKTAA 
AAGTTrCCRA 
AAOAAACTCC 
TATTGCTAGT 



1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 



TACaCTTTOC CXTTCCTGCAG
TCTATTTCCA ATGCTCCTCT 
TATGAAAGCA TATTTTATTT 
GAAACRCAAT AATAGTATTA 
CTTGTTTACA CAAAAACGAG 



TTTCTTCTCT 
CKAACCGCTT 
ACTTGGTGTT 
ACTAACTAGA 
TATGATTTAG 



CTCTCAATCC 
CTCCTAATTT 
AAGGTTGGTG 
ATGAGTAAGC 
TTATTTCAOT 



CAAQGTTCAA 
GTAACTTTTA 
CTGTGTCAGT 

AAOTGrrror 

GATGCATAAT 
TATACAAAAG 



TGATTTGAAT 
TCACATGTAA 
AAACTAGTCT 
GTTTAGATTT 
GCAGTTTGTT 

attccccctc 
atgtcx:aatt 



TTTATTTTAA 



tttcagtata 
ggtattgcaa 

GTGGTTCTTT 
TAAGCACTTT 
AACCTGACAT 
CTCTTTGCAT 
TACTTGCATA 
ATCGATAAGT 
TAATAAAATG 



GCTCCCAACC 
AGTTTCTGAA 
GAAATAGCCC 
TCTATTGAAT 
CACTCATACT 
AAAAATTAGG 
CAAGTTGTTG 
GGTGCX3TCCA 
AAATAATGAT 
AAACTTTAGT 
ATAAATTCTT 
GCAGTTTCTT 
TATAACAATG 
CTCTGCCAGT 
TAATCAAGGT 
TGTAAACCAT 
GTAAATACAO 



CCCXTCTCAT 
TTTCTTTTAA 
TCATAAAACC 
TTCAGAQAAG 
AGTTGAAATT 
TAATTATTGC 
TCSVCAQTTOA 



GGAAGAGCTA 
ATAATTGTAG 
GGACAATTTT 



ATAAGTGCCT 
CTAGTTTCTO 
ATTTQGTAOA 
TCCTG TGCCa 
CTTTTGATCT 



AGCATCCKCC 
ATTACAGTTT 
TAAGCACTTG 
AGCCTTCTAA 
TTTAATAGAA 
AOATTGATOT 
GACTTAATTT 
TCATAATTCA 
TCTQQAGATT 
TTTGCAAAGT 
GTATGGAAAC 
AAACCAGGCA 
TTTTGGAGAT 
GGCAGGTTTC 
GGTGGAATCX 
TTCAATGTTT 
GTAATGCTTT 






2340 
2400 
2460 



21 



41 



I I I I I I 

MDARKVPQXD liRVKKNliKKP RYVKLISMET S3SSDDSCDS FASDHPAMTR UJSVSEGCRT 
RSQOIHSGPL RVAMKFPARS TRGATOKKAE SRQPSENSVT DSNSDSEDES OWFLEKRAL 
HIKQNKAMIiA KLMSBLESPP GSPRGRHPLP GSDSQSRRPR RRTFPGVASR RNPERBARPL 
rtfRSRSaitGS LDAIiPMEBEE EEDKITKLVHK RKTVDQXMNE I}DLPRSBKSR SSVTLPHllH 
PVEBITEEEIi ENVCSMSREK lYNRSLGSTC BQCRQXTIDT XTNCRNPDCH GVRGQFGQPC 
LRNRYGEEVR DALUIPNHHC PPCR6ICHCS FC31QHDGRCA TOVLVYUUCX HQFGNVHAYL 



Seq ID NO: 448 DNA sequence
Nucleic Acid Accession #:
Coding sequence: 1..1314



I 

ATGTTACAGQ 
AAACCCCGTA 
CTGAGCCTGG 
TACTTCCTCT 
CTGGACTGTC 



ATCCTGACAG 
TCCCCATGGA 
CGAQTATCAT 



TQATCAACCT CTQAACAGCC 



CATTGTGGTT GTCCTCATCA 



GOGGGCAGCC TCTCCACTTC ATOCOTAGGA 



GGGAACTGGT 
AGGCAOATGG 
GATCTGGATG 
GGGCCCTGTC 
AAGACCCCXC 
AGCATCCAGT 
CTCACGQCAG 
GGCTCAOACA 



TCTCTGCCTG 
GCTACAGCAG 
TTGTTGAAAT 
TCTCAGGCTC 
GTGTGGTGGG 
ACGACAAACA 
CCCACTGCTT 



TTTOGACAAC 
CAAACCCACT 
CACAGAAAAC 

TGGGGAGGAG 
GCACGTCTGT 
CAGGAAACAT 
CTTCCCATCC 



TCCACACTGC 
TTCAC3VGAAQ 
TTCAGAGCTG 

CTGCACTGTC 
GCCTCTGTGG 
GGAGGGAGCA 
ACCGATGTGT 



I 

TCGATGTCAA 
TCCCCATCAT 
AGGTGATTCT 
AGCAQCIGTG 
AGAGCTTCCC 
AG6T6CTGGA 
CTCTCGCTCA 
TGGAGATTGG 



ACCCCTGCGC 
CATAGCACTA 
GGATAAATAC 



ATTCTTGGCC 
TCCTGGACCC 
TCAACT6GAA 
CCAAQATCAT 



QACACCTGCC 
GT6GGCAT0G 
AAGGTCTCAO 



3 GCACAQTCAG GCCCATCTGT 
TCTGGATCAT 
TGCAGGCGTC 
GGGAAGTCAC 
AGGGTGACAG 



CCTATCTCAA 



AGTCCAGGTC 
CGAGAAGATG 
TGGTGGGCCC 
CTATGGCTGC 
CTGOATCTAC 



CTGCCCTTCT 
TTTACX3AAGC 
ATTGACAGCA 
A-IGTGXGCAG 
CTQATGTACC 



TTGATQAGGA 

AGAATGGAGG 
CACGGTGCAA 
GCATCCCGGA 
AATCTGACCA 



CTCG6CCACA 360 



1020 

1080

1140 
1200 
1260 



AATGTCIGGA AGGCTGAGCT 



CCC3VGACCAG 480 

GAACTCaAGT 540

GAAGAC5CCTG 600 

TTGGCAGGTC 660 

CCACTGGGTC 720 

GGTGCGOGCA 780 

CRTCATTGAA 840 

GTTCCCACTC 900 

ocTCAcrccav 960 

GAAGATGTCT 
TGCAGACGAT 
AGGGGGTGTG 
GTGGCATGTG 
AGTATACACC 
6TAA 



VFLOQQPiaF 
QIWFSACFDII 
GPCbKSSLVS 



I I I I I 

LNSIUVKPLR KPRIPMETFR KVGIPIIIAL LSLASIIIW VLIKVILDKY 
IPRKQI.CDGE LDCPLGEDEE HCVJCSPPBGP AVAVRLSKDR STIOVIiDSAT 
FTEAIAETAC RfflUGYSSKPT FBAVEIQPDQ DLDWEITEN SQELRMRNSS 
LHCLACGKSL KTPRWOGEE ASVDSWPWQV SIQYDKQHVC GGSILDPHWV 
TDVFNHKVHA OSDKLGSFPS LAVAKIIIIE FNPKYPKDND lALMKLQFPI. 
I.PFFOEBI.TP ATPLWIIGWG FTKQNGGKMS DILU2ASVQV IDSTRCNADO 
MCAGIPEGGV DTCQGDSGGP LMYQSDQMHV VGIVSWGYGC GGPSTPGVYT 



Seq ID NO: 450 DNA sequence
Nucleic Acid Accession #: XM_C
Coding sequence: 1..3042



I I I I t I 

GCTCACCCAQ GAAAAATATG CAATCGTCCX: ATTGATATAC AGGCCACTAC AA TGGAT GGA 
GTTAACCTCA GCACCGAGGT TGTCTACAAA AAAGGCCAGQ ATTATAGGTT TGCTTGCTAC 
GACCX5GGGCA GAGCXTTGCCG GAGCTACCGT GTAOGGTTCC TCTQTGGQAA GCCTO TOAGQ 
CCCAAACTCA CAQTCACXaVT TGAC3VCXaAT GTGAACaaCA OCATTCTGAA crrOGAeQAT 
AATGTACAGT C3VTGGAAACC TGGAGATACC CTGOTCaTXa CCAQTACTQA TTACTCCATG 
TACChGOCAG AAGAGTTCCA GGTGCrrCXX: TGCAGATCCT GCGCCCCCAA CCAGGTCAAA
GTGGCAGGQA AACCAATGTA
GCGGAGGTTG GGCTTCTGAG 
TACCCCTACA GAAACCACAT 
AAGTTTGCTC TGGGATTTAA 
CAGCAGCTCG TGGGTCAGTA 
GGAGGTTATG ACCCACCCAC 
TGCGTCACAQ TCCATGGCTC 
TTGGGCCACT GCTTCTTCAC 
CTTGGCCTCC TTGTCAAGTC 
AAOATGATCA CAGGAGACTC 
GCTGTOTCCA CCTTCTGGAT 
GGATCTGAGG AAACTGGATT 
GGAATGTACT CCCCAGGTTA 
GCACATTCCA ACTACCGGGC 
TCTGCCAAGG ACAAGCGGCC 
QACGCCGACC CGCTGAAGCC 
AACCAGGACC ACGGGGCCTG 
GCTGACAATG GCATTGGCCT 
TCCakAGCAAG AGATAAAGAA 
ATGATGGACA ATAGGRTCTG 
ATAGOCCAGA ATTTTCCAAT 
AACTGCACTT TCCX5AAAGTT 
CX3CCTGAATA ATGCCTGGCA 
GACOTTCaSA TTACTTCCfla 
GACATGGATG GGGATAAGAC 
CCTGGCTCXrr ACCTCACOAA 
GTTCCCGACT GOAjGAQGOaC 
TACAAQACCA GTAACCTGCO 
TACCTGGAGG GGGCGCTCAC 
CTGCAGAAGG GCTAC3VCCAT 
CTCATCAACT TCAAC»AGGG 
ACATTCTCCA TCCTCTCGGA 
GICTTCGTGA GGACCTTGCA 
TACTACTGGG ACGAGGACTC 
GAGAAGTTTG CTTTCTGCTC 
CCAAAGAACG CAGGCGTCAG 
GCTGTCGTAG ACGTGCCGAT 
CATTTCTTGO AGGTQAAGAT 
TTOG CTTACA TTGAAGTGGA 
GTGQTGATTG AOGGGAACCA 
CTGC3UVGGCA TACCATGGCA 
(3TGCTTATGG CATCAAAGGG 



GOCCCATCTG 
CCATCCCTGT 
GGTAGACTAT 
CCCTGCCAGC 
TCAGAGACCC 
GGTGCTGGCC 
TCTGTOCCTC 
CAAAQATCCA 
TTCACAGATC 
CTTGQC
GGCAGCACAC
CCOGATTCAC 
ATACATCAGG 
CAATGGCTTG 



ATAGTGATGG 
TTTGACTTCG 
TTGGAGGGCA 
TTCCACCTGG 
GACCTCTCCA 
TTGATCAAGG 



TGGAACXXTTC 
CTACCCAGGG 
GGCCAATCCC 
TTGGTTTATT 
TTCAGAGCAC 
TGGCATGATC 
GTTCCTCTCA 
CCGGGAGCCG 
GCTGCGOGGC 
GACCCTGGCC 



TAGAGGAATT 
TGTGGCCCTG 
GAGCTGCCCC 
AGTGTTCTTC 
ATCTGTGTTC 
GAATOACAAC 
CATTTQCAOT 
AATGAAGATC 
CAGGAGCACC 
CCACTGGQAC 
CGACTGGATC 
TGTTCACAAT 
6ATGGACAAA 
AGGGCTGTTG 
CATGAAAG6C 
TGACTGCACA 
GCX:CAAGAAG 
QGAGAGTTCC 
TGGGMGAAG 



CTCCCCTCGG 
TACATCCCCA 
AACAACAACC 
TTTCACCACG 
ATTCCACTGG 
ATAGACAACX3 
ATCATCTCTG 
GCCATCATCA 
GGGGATGTOT 
AGTGGTGGAA 
GTTGGCGAGA 
GGCTTGGACC 
CAGTTATATG 
GAGGGCCGGC 
CATAACAACG 
GGAGAGCCTQ 



TAGACGGCX3T 
GGGAGATGGA 
ATACCTTTGG 
COOAGCTGAA 
CCGGTQATGT 
TCCATCATAC 

ACGTTaToaa 

GCAACACTTT 
ACCGTGACAG 
AGCCCAGGCA 
TCATCAACTG 
TACCAACGGG 
GAAAATTCTA 
GAGTCAAAAC 
CCAGATACAG 



GGACaTGCGQ 
GGACAAATGC- 
GGGCCACATC 
GCATATGGGA 
AGACGAAAGG 



CTATAACTCT 
TGACCACTGT 
CAAGATGTGC 
AGACTGCAAT 
TGCCGCTGCA 
CCCCTCCXJTG 
TAACAACCGA 
CACCGAGGCC 
CCCTCACCAG 



TGGCTGGTCC 
GaOTQCTATG 
ATCAAGAATG 

CATTACCAGC 
CAGACGGCCC 
CGAGTGGGGC 
CGCCTGCTGA 



GGCTOOACAQ 
CCTTCCCQTA 
GTGGCAACGT 
ATAGCGGAAG 
ATGGCCCCAT 
ACACX»GCX5C 
TGACCGGCAT 
GGCCCTGOTT 
AOGGCICOQT 



CCGCCGAACT 
TCTGCTACCC 
AGCAAACGTC 



rrCCTGAAGC 



GCCACAGCTT 
CTCTTTGGTT 
AAGCAGCACT 
TAaXrAGTT 



CAAOTTGTGC 
GCCACCTOGT 
GTCCCCCAGC 



CCCCTGGGGC 
TTCTCTCCTA 
GTGCTGACAG 
GGGCTGGTCA 
aAOAAASAGC 



AACAGTTCAT 
OAGAGGTGAG 
GTCCATGTGC 
CCATTTCAGA 



rAGAA 
TTCAGGAGAC 
AGTTCTGAGA 
GGATATCCAC 
AACTAATGCC 
ACTQCAATGC 



GCTTTTCAAC 
AAOATACGTC 
TCTCAAGTTa 
GGTOACACTG 
GGTGAAGAAB 
GAGGGTGACT 
AGCTGCXTTGQ 
TGGTGCTGCC 
AATGCTGGAA 
TTCAGTGGGG 
CTTTGGCAGO 
CCCATGGTCT 
AGQAAAICTT 
CIGGCTATCC 



TATGIGGCGA 



AAAGAGCAAA 



TQAAAGCTCA 
TAAAQATTAA 
ACCCCAAGTT 
CrCAGCTGAA 
TCTTCCACCT 
OSGAGGATGG 
CGAGCTTCAG 
CCATCCCTQA 
CATGGACX»6 



GGGGAC6GAA 
GACCCTCCCT 
CAACATCCAA 
CCTGGCCTTC 
TGCCTTTGAG 
CAACCAQCTQ 
GTCCGAGTAC 
CTGCATCRAT 
CATTCAAQCC 
CCRCCCTCTT 
GGTTGTCACX: 
CGCCATCTGG 
GCGAGGCACC 
CAAGACGGGC 
CAGGAGCC 



GAA0QAGA6A 
AGCTCTGATT 
CACCQAGAGG 
AACAAAGGAC 
CTGQAACGAC 
CATCCAGGTG 
GAACTCCATT 
CAATTCCATA 
AGIGCTGQAA 
TGGCTTCAAA 



AAGAAGTTGT 
CTTGGCAGCA 
GAAGGCCGTG 
ACCTGCCCCT 
ACATTCACTT 
GTTTGGQQAC 
AGCCCTGACC 



GACCAGTGGG 
TTTCAGCCCT 
ACTCAAGTGT 
TCCTGCAGCC 
CATATCAGGA 
CAGCTAGGAG 



TTCCAGAAAT 
TGATATCCAT 
TAGCTTGAGG 
CAGGTGGAGA 



TACTCCTGTA 
TTGGGGAAGA 
AGGTTTGGAC 
CTGCTGCATT 
GATGCTGGGT 



TCACATGGTA 



AATCACMAG J! 



GGATGGCTGG 
GATGGGCCAA 
CTACCTGGAG 
TCTTGGGTGC 
GACCTGGQTT 
6TAGTCTGGA 
GTAAATGTAG 
AACCTCACAG 
GCCTCTGGCC 
TGACTCTCAA 
CCTGGAACCC 
ACACGGGATG 
AGGCAGTCAG 
GAGGCCAGTG 
CTGGGGGCAT 



QGAGKTGAQG 
AAGCTGGTGA 
AGTGTGGTCA 



CTAOQG6GTC 



TCCTGCCTTA GGGCCTCATT 
AGACCCTAGA TGTGCTOGTA 
TATCTAGCCC AAAGCCTTCA 
AACCACACAG CTAAGGGAGG 
T7GCCTCAAC AACCGGCCCC 
OACAAOTCCC CTOOAAGGAA 



CTTCACTTTG 
TGGTGCTACC 
GGGCTCGCCA 
GGTCCACCCC 
ATAGAGAGCC 



TGCCTCTGAC 
TGCTCTTCAT 
CrCCCTCXKSC 
TTTTAACAGA 
GCCTGGGGAG 
AGAGTGCCXSV 
AGGAAATGAC 



QCCCTTTCCT 
ATGGGCTTTG 
TCCAAGAGGO 
OCAGGGAACT 
CTGGQATTTC 
TGGGGAAAGT 
CCCCACCCTA 
GGCACTCCTG 



CAACCACAAA 



CTAATGCAAG 
AATGTTQAAT 
GTTGTACATA 
ACCAAGAGCC 
TTOTOCTCCT 



AGTCCCTTTC 
CAAAGAGCTC 
AGAGTCTCCC 
CTCTTTCCTT 
GTCCRAGAGT 
TCTTTCAGCT 
GGCCA6AGTC 
AAATGACCTC 
GGTCTCACAC 
GTCTTTGGCT 
TGTTTCACAG 
AATATCTAGG 
TOTTATTTCT 



: CCTCCXCTTG 
3 OCTGGGTGCA 
3 CTCTACAGGT 
r GGCTGATCTT 
r TAATGCCCTG 



AGAGCTGGAA 
GAGCCCX:CAA 
GCCCTTGCTG 
AGGTAGCTTC 
TGACAGCTAG 
GTGTTGGCGG 
CAGTAGCTGC 
GAGGCCCAQC 
GGGT6TCTGA 
CTCTCTCCCT 



CTGTAAGAGG GAGAACTCTA TCTGTGGTTT 



CAAAGAGGGC 
CCSVTTCCXKA 
GCTGGGAGGT 
ACAGGAAGGA 
ATGTCCTTCT 
TCTGAACCAC 
CAGTTCATTT 
TACaGGATCr 



GATGAACTAC 
CTGCCTGGCT 
GGTGGGAGCC 
GACCATAGGG 
CTTCTTCCAG 
TGTCCACGGT 
TTAGGATGTO 
AAAAAAGATA 
GTACATAAAA 



ATTTATCCCC 
CCCTCXaCCC 
AACTG TCAOG 
CTCTGCTTTT 
GGAGATTAGT 
TTTGTTGAGT 

ATCAcrrrcA 

TCTATTTGAA 
GTTTCTTTCC 



CTCCCTGCCG 
TGGCCCACTC 
6CACA6AGGA 
AGAAGTGAGC 
GGCCTCCAGG 
ATATAGAAAA 
GATGGGAAAG 
CCACACCACA 
TGGAAATGGG 
CAGATCTCTT 
TCCCTGTGGC 
AACTCCCCAT 
AGAGGOAGTA 
ACAGCTATTG 
GGCCC3MXTT 
ATAATCTTGC 



AACTGCACCC 
GAGOTCTTTC 
AAAGATATGG 
GGTQATGQAG 
TTTCACTCTT 



GTTTGTAAOA CTTAAGTOAG 1 



AGTTCrCAQA 
TAAACCATTC 
GCTTA6AAAA 
AAG6AAA6CA 



1080 
1140 
1200 
1260 
1320 
13B0 
1440 
1500 
1560 

16B0 
1740 
1800 
1860 
1920 
19B0 
2040 
2100 
2160 
2220 
22S0 
2340 
2400 
2460 
2520 
2580 



3360 
3420 
34B0 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 



4200 
4260 
4320 
43B0 
4440 
4500 
4S60 
4620 
46B0 

4800 
4860 
4920 
49B0 
5040 
5100 

5220 
5280 
5340 
5400 
5460 
5520 




ACGCTCCTCT GAAATGCTTG TCTTTTTTCT GTTGCOGAAA TAGCTGGTCC ITTTTCGGGA S640 

GTTAQATGTA TAGAGTGTTT GTKIGTKAAC ATTTCTTGTA GGCATCACOV TGAACftAAGA S700 

TATAITTTCT ATTTATTTAT TATATGTCCA CTTCAAGAAQ TCaCrGTCftG AOAAATAAAG 5760 
AATT6TCTTA AATX3TC»AAA AAAAAAAAAA AAAAAAAAAA AAAAAAAA 
MDGVNLSTEV
I-BDHVQSWKP 
DMRAEVGLLS 
HMGQQliVGQY 
YNSLGHCFFT 
DCNAVSTFWM 
NNSAHSNYRA 
AYKNQDHGAW 
GTEMMDHRIW 
LAFRIiNNAWQ 
SEYPGSYLTK 
HPLytBGALT 
RGTTPSILSD 



RiniVMGEMB DKCyPYHNHI 



ED6PBERNTF 
ANPKNNLINC 
CTIIlnNGVKT 



aPGCIiDHSGR 



NDNWLVItHPD 
RSTHyQQYQP 
VHNRIiLKQTS 
MKGCBRIKIK 
ESSKQHFFHL 



DHCLGLLVKS 
AAAGSEETGP 
TEASAKDKRP 
CRFADNGIGIi 
TI.PIGQNFPI 
AFEDVPITSR 
CINVPDWRGA 
WTLQKGyri 
KTGVFVRTLQ 
ALIPKHAGVS 
WNDFAYIl 



I 

SYHVRFLCGK 
VLPCRSCAPN 
CNFFDPDTFG 
YIRDLSIHHT 
GTLLPSDRDS 
WFIFHHVPTG 
FLSIISARYS 
TIASGGTFPY 
RGIQLYDGPI 
VPFGEPQPWF 
ICSGCYAQMY 
HWDQTAPAEIj 
MDKVEQSYPG 
DCTATAYPKF 



I 

PVBPKLTVTI 
QVKVAGKPMY 
GHIKFALGFK 
FSRCVTVHGS 
KMCKMITGDS 
PSVGMYSPGY 



LBI6EEIDGV 120 



TKDHFLEVKM 

NSILQGIPWQ 1.FMYVAT1PD NSIVLMASKG RYVSRGPWTR 
GFKGSFRPIW 



1 VTLDTEDHKA KIFQWPIPV VXKKKL 



NIQHCTFRKF 
NQLDMDGDKT 
IQAYKTSNLH 
AIWI.IHFHKG 
RSHYYWDEDS 
TBRAWDVPM 
IQWVIIxata 
VLEKLGAORG 



KGIiLIKDVVG 
YPGYIPKPRQ 
SEHIPLGKFY 
REPAIIRHFI 
SLFVGBSQIV 
VALBGRBTSA 
SVFHOVDGSV 
MKIIKNDPPS 
DWIRVGLCYP 
GLLFIiKLKAQ 
PKECLFGSQLK 
QRWSHTSFR 



Seq ID NO: 452
Nucleic Acid Accession #:
Coding sequence: 261..2Bfil 



GAGCTAGCGC TCAAGCRSAG CCCAGCX3CGG TGCTATCGGA C 
CGGCGCGGGG AGCCAGCGGG GCTQAGCGCG GCC3U3GGTCT C 
AGCTAOCSWCT CCQCTTGOCC ACQCCCCSGG AGCTCGOXSC GCXTrOGCGGT 



TGCTGACCAT 
CTGGGTGCCC 
ACCATGTGCA 
CCATCCACAT 



TATCXSGCXaVG 



ATGGGAGCTG 
ACTCTQACCT 
CCTGAGTTGC 



OCCCTTTCCA 



TCGACCCCAA 
AAGAGAGTGA 
TTGCRGTGAA 
AATTGGGAAG 
TGAAAGGAAA 
CTGCTGCTGC 
CTTTGTCCAO 
TATCTCAQAC 
TATGCAATCG 
AGGTTGTCTA 
GCCGGAGCTA 
CCATTGACAC 
AACCTGGAGA 
TCCAGGTGCT 
TGTACCTGCA 
TOAGCCGGAA 
ACATCIGCAA 



GCACATCXTG 
GGGCAATTTC 
CTATGGTCTG 
AAA GCTCTOC 
CTATTTTTTT 
ATCAGGCACA 
AOGTCTGQTC 
TGATGAAGGT 
CAAACACTTC 
TCXaVTCATCT 
CCGGGTATTC 
TGAOTGOGTr 
TAAAGOTdGQ 
TCCCRTTGAT 



GGCAAGACAC 
GGCAAGCTGG 
ATTGACAACG 
ACX3VTCMTT 
AA6TACATTG 



CTGGGAGGCA 
GCTTCCCTGG 
AACCCTGQAA 



: ACTGTCTOSO 



TGCTCCTCAC 
TCATTAAAGA 
GAGGAGAGCT 
TGTATGGAAG 
GGGTTGGTAA 



CTCTTCTGCC 
CCAOGACGAG 
GCATGCTGGG 
GGCTGATGAA 



51 
! 

CGAGCGCAAO SO 

TTCCC3«3ACr 120 

CAGOSACCAG 180 

CTACAGACCC 240 

TTCAAGGCCA 300 

ACAGTG6CTG 360 

GACCAAGACC 420 

ACGGTCTATT 480 

CCGATTGTTT 540 



6TCATCCATT 
CAGTATTTGA 
TCTCGAAATC 
CTGCACCTTG 



GATTTAGACR 
ACCATATTGA 
AGACAGAGCA 



COGTGTAOGG 1 
CRATGTGAAC f 
TACCCTGGTC 
TCCCTGCAGA 1 
CATOQGGGAG C 



AQTACCOGAT 



GCTCCAATGG 
TCACGGAAGA 
AGTCTGGAAC 
ACTCCTACCC 
GGATGGCCAA 
GATTTT6QTT 
GTTATTCAGA 
GGGCTGGCAT 



TTTCTTTGAC 
ACACTTGGAG 
TCACTTCCAC 
CAGGGACCTC 
CTTGTTGATC 
TGGGCCGGAG 
CCTCCTCCCC 
AGQGTACATC 
TCCCAACAAC 
TATTTTTCAC 
GCACATTCCA 
QATOVTAGAC 



3 GGAAGCCTGT 
: TGAACTTGGA 

V CTGATTACTC 
: CCAACCAGGT 
3 GCGTGQACAT 

V TGGAG6ACAA 



GGCCAGGAAG 
CCCTTGGAQT 
ATATCATGGA 
TGGCQAATAT 
GTGGTTOGAT 
GAAAGCTCAC 
TGGftOTTAAC 
CTACGACEGG 
GAGGCCCAAA 
GGATAATGTA 
CRTGTACCAG 



TTTCTAACTQ 
CATCXJAGGCT 
TTCAATGTTT 
CaTGATAAAO 
CCAGGAAAAA 
CTCAGCACCXJ 
GGCAGAGCCT 
CTCACAGTCA 
CAGTCATGGA 
GCAGAAGAGT 
GGGAAACCAA 
GTTGGGCTTC 



AGCCCCGGGA 
CCTGGCraCG 
AAGGCTTCTT 
CCTCTGGAAT 
CGGGGTCGCC 
GGGAGCAATQ 
CTCTGRCTCC 
TCTTCATCCA 

ccTCJGaccro 



GCOGGCCATC 



GCTTACAGQA 
GGCTCAGGGA 
CTTTGCTCAC 
GGCTTTGCTG 
AAGAGGGTGA 
GGGAACXGAG 
GGATTTCAGA 



GGCACGGAGC 
CTGGCXXiQTQ 
TCCATCCATC 
AAGGAOGTTG 
GAACGCAACA 
TCGGACCGTQ 
CCCSiAGCCCA 
AACCTCATCA- 
CACQTACCAA 
CTGGGAAAAT 
AACGGAGTCA 
TCTGCCAGAT 
ATCAGACaCT 
GTGTGGCTGa 
ATGAAGGCTG 

TTCRGccxrrc 

GTCTCTCTGG 
CTTATQAGCA 
AGTCCACAGA 
CACAGGGGGC 
GCTGGAAATA 



TGAAGCATAT 
ATGTAOAGGA 
ATACATTCTC 
T GGGC TATAA 
CTTTTGACCA 
ACAGCAAGAT 
GGCAAGACTG 
ACTGTGCCGC 
CGGGCCCCTC 
TCTATAACAA 
AAACX^CCGA 
ACAGCCCTCA 
TCATTGCCTA 
ACAGCTGCCA 
GGGGCATTTT 
CCTGCCGCTG 
CCCACTCATG 
CAGAGGAATT 



AAGGGGAGGT 
TCQCTGOOTC 
CTCTTTGQGC 
CTGTCTTGGC 
GTGCAAGATO 
CAATGCTOTG 
TQCAGGATCT 
CGTGGGAATG 
CCGAGCACAT 
GGCCTCTGCC 



r GTTCATCTCA 840 



1140 
1200 
1260 
1320 
1380 



2100 
2160 
2220 
2280 
2340 



2580 
2640 
2100 
2760 
2820 
2880 
2940 
3000 
3060 
3120 



GCTCTGGGAT 
CTGGTGGGTC 
TATGACCCAC 
ACAGTCCATG 
CUCTQCTTCT 
CTCCTTGTCR 
ATCACAGAGG 
TCCACCTTCT 
GAGGAAACTG 
TACTCCCXaO 
TCCAACTACC 
AAGGACAAGC 



CCTGCTGAAO CTGGTOACTA 
ATGGAGAAGT GTGGTCAGAG 
CAGTCCCCAG GCAGCSICTGC 
TGCCTTAGGQ CCTCATTTGC 
CCCTAGATGT GCTCGTACTC 
CTAGCCCAAA GCCTTCATTT 
TAACAGATGG GGAAAGTGAO
TGGGGRGCCC CACCCTAGCC 
OTGCXXAGGC ACTCCTGAGG 
AAATGACTAG AGTAGAATOA 
AACCCGCCCT CCCCTTGGTG 
AQCCCAGCXrr C3GGTGCACAG 
TCTGCA6CTC TACAGGTGAG 
CCAATTTGGC TGATCTTGGG 
TGCTCCTTAA TGCCCTGCTC 
TAAGAGGGAG AACTCTATCT 
GTCTTOTGAT GAACTACATT 
AGAOGGCCTG CCTGGCTCCC 
TTCCCCAGGT GGGAGCCAAC 
GGGAGGTGAC CATAGGGCTC 
GGAAGGACTT CTTCCAGGGA 
TCCTTCTTGT CX^CGGTTTT 
GAACCACTTA GGATQTaATC 
TTCATTTAAA AAAGATATCT 
AGGATCTGTA CATAAAAGTT 
\ GCACAAATTT 



CCX:CCAAGAT 
CTTGCTGCCA 
TAGCTTCTGG 



TTGGCGGTCC 
TAGCTGCAAC 
GCCCAGCAGA 
TGTCTGAACA 

GTGGTTTATA 
TATCCCCTTT 
TCCACCCAAC 
TGTCAGGQAG 
TGCTTTTAAA 
GATTAGTGGT 
GTTGAGTTTT 
ACTTTCAGGT 
ATTTGAAAGT 
TCTTTCCTAA 



CACCACATTG 
AAATGGGGAC 
ATCTCTTCCC 
CTGTGGCCTT 
TCCCCATTGG 
GGGAGTAGGO 
GCTATTGGGT 
CCACCTTATA 
ATCTTGCACG 



CACACAGCTA 
CCTCAACAAC 
AAGTCOCCTC 
TC CTGC TCCC 
CTICTTTGTTC 
TGCTACCTGG 
CTCGCCATGT 
CCACCCCAGT 
GAGAGCCCAA 
AGGCACCAGA 



AGGGAGGGCC 311 
GTCTTTCCCA
GATATGGCTG 
GATGGAGAGG 
CACTCTTCTA 
GGCCAGGAAT 
TCTCAGAQTT 
ACCATTCACC 
TAGAAAATTG 



TGTAAOACTT AAOTGAGTTA GGTCTTTAAG GAAAGCAACG 



TGTAAACATT 
ATGTGCACTT CAAGAAGTCA 
GAGATGTCCT TTGCATTGCT 
TTGGAAAAAT TTTGCTGTTA 



AATAAAOAAT 
6TACCTAGAG 
CATACSUUVOG 



CCACAAACTC 
AGACTCGGTC 
CCAAACATCT 
CTTCAAAGGC 
AGAGTCAAAA 
ATGCAAGGGT 
GTTGAATGTC 
OTACATATGT 
AAGAGCCAAT 
TCCTCCTTGT 
CTCCTCTGAA 
AGATGTATAG 
ATTTTCTATT 



AGOGCACAC» 
ACTACXTDGTC 
CTCTCCTGTC 
TTCTGGTGAQ 
CCCTTTCAGC 
AGAGCTCCTG 
OTCTCCCTGG 



3240 
3300 
3360 
3420 
3480 
3540 
3600 
3S60 
3720 
3780 



CAAGAGTOCA 3B4 



CAGAGTCACA 
TQACCTCA1G 
CTCACACTGT 
TTTGGCTCAG 
TTCACAGTAC 



TATTTCTGTT 



ASTQTTTQTA 
TATTTATTAT 
GTCATQATTG 



4200 
4260 
4320 
4380 
4440 
4500 



ATGTCAAAAA AAAAAAAAAA 






MGAAGRQDFL 
GKTU.LTSSA 
TIILYGHADE 
ERSHGBR6VI 



TVAAGCPDQS 
PIVLRTBHIL 
LBLKGQKKLS 
r VIHSDaPDTY HSKKESBRLV 



SRNUDMARK AHTXLGSXHF LHUSFRHFWS 7LTVKGNPSS 



I 

PELQPMNPGH 
IDNGGEI.HAG 
WTFLNKTIiHP 
QYIjMAVPDQR 
SVEOHIEYBQ 
EKISDLMXKH 



DQDHHVHIGQ 
SALCPFQOIF 
GGf4AEGGYPF 



HRGSAAARVP 300 



EIDGVDMRAE VGIOiSHNIIV MQEMEDKCYP 
GTRUOWGOQ LVGQYPIHFH LA6DVDER6G 
KDWGYNSLG HCFFTEDGPE ERNTFDHCLG 
PKPRQDCMAV STFWMANPNN HLliSCAAAGS 
IjGKFYMNRAH SNYRAGMIID HGVKTTBASA 
ISHFIAYKNa 



AEEFQVI.PCR 
YRKHICajFFD 
YDPPTYIRDL 



SC»PNQVXVA 
FDTFGGHIKF 
SIHHTFSRCV 



EBTGFWFIFH 
KDKRPFI.SII 
SAQEOFUiTQ 



HVPTGPSVGM 
SARYSPHQOA 
MKAGOII.I.G6 



laTVTimnvN 

GKPMYLHIGB 
ALGFRAAHLE 
TVHGSNGLLI 
ITEDSYPGYI 
YSPGYSEHIP 
DPLKPREPAI 



Seq ID NO: 454 DNA sequence
Nucleic Acid Accession #: NM_013282.2
i..2466 



31 



41 



51 



OGACTCCPTA 

ACCCACACGG 
CAGGAGCTGT 
GAGGAOGGCC 
GTCCGCCAGA 
ACCX3ACTCCG 



GGGCTGTACA 



ATACCCTCTT 
GCCTCOTGCT 
GCTGCTGCCT 
AGACTGACAG 
AGGTCAATGA 
TGGTCAGGGT 



6AGAA0GGCG 
AAGTGGCAGG 
AAGGAGCGGG 
CGGGAACTCT 
TTOSTGGACG 
CCCATGAGAC 
TGCOGGGTCT 
TGOGATGAGT 



TGGTCCAGAT 
ACCTGGAGGT 
GCTTCTGGTA 
ACGCCAACGT 
AAGTCTTCAA 
QGAAGAGC6Q 
GCGCCTGCCa 
GCGACATGGC 



GTCCOGTCCA 



GGCTGAGAGA 
ACTGGOGCAA 
ACCACTACGG 



I I 
TGGCTCAGAG GTGCTGOTAA AACTQATGGG GGTTTTTGCT 
CACCATGTGG ATCCAGGTTC GGACCATGGA OGGGAGGCAG 
GTCGAGGCrG ACCAAGGTGG AGGAGCTGAG GCGGAAGATC 
GCCAQGCCTG CAGAOGCTGT TCTACAGGOQ CAAACAGRTG 
CGACTACX5AG GTCCGCCTGA ATGACACCAT CCAGCTCCTG 
CCCCCACAGC ACCAAOOAGC GGGACTCCOA GCTCTCCQAC 
GGGCCAGAQT GAOTCAGACA AGTCCTCCAC CCACGGOQAG 
GATGAGGACA TQTGGGATOA GACQGAATTG 
GCTCGGGACA CGAACATGGG GGCGTGGTTT 
GCCCCCTCCC GGGACGAGCC CTGCAGCTCC 
ATTTACCACG TGAAATACX3A CGACTACCCG 
GACGTCCGAG CGCGCGCCCG CACCATCATC 
GTCATGCTCA ACTACWACCC CGACAACCCC 
CGACGOGGAG ATCTCCAGGA AGCGCGAGAC CAGGACGGCG 
GGTGCTGGGG GATGATTCTC IGAACQACTQ TCQGATCATC 
GATTGA6CX30 GCGGOTGAAO GOAOCCOAT GOTTOACAAC 
GCCX3TCCTQC 
CCTGTGCGGQ 
CTTCCACATC 
CTGCXXTTGAG 
GAGCAAGAAQ 
GGGCATGGCC 
ACCCATCCCG 



GTACGTCX3AT 
GACGCGGAAG 
GGAGGACXSTC 
GAACTCCAGG 



GGCCGGCAQO ACOCCGACAA GCAGCTCATO 
TACTGCCTGG ACCXXKXXrCT CAGCAGTGTT 
TGCCGGAATG ATGCCAGCGA GGTGGTACTG 
AAGGCGAAGA TGGCCTCGGC CACATCGTCC 
TGTGTGCGCC GCKCCAAGGA ATGTACCATC 
GGGATCCCCG TGGGCACCAT GTGGCGGTTC 



TTTQCTCCCA 
GTCAGGGTGG 
AACCX3CTACG 
TTTCTCGTGT 



QAGCBTACTC CXTEAGTCCTG GOGGGGGGCT 
TCACATACAC GGGTAGTG6T G6T0GAQATC TTTCOBQCRA CAAOMGAOC 
CTTGTGATCA GAAACTCACX: AACA0C3UICA GGGCGCTGGC TCTCAACTGC 
TCAATGACCA AGAAGGGGCC GAGGCCAAGG ACTGGCGGTC GGGGAAGCCG 
TG03CAATGT CAAGGGTGGC AAGAATAGCA AGTACGCCCC CGCTGAGGGC 
ATGQCATCTA CAAGGTTGTG AAATACTGGC CCGAGAAGGG GAAGTCCGGG 
GGOGCTACCT TCTGCX3GAGG 6ACGATGATG AGCCTGGCCC TTGGACGAAG 
ACOGGATCRA GAAGCTGGGG CTGACCATGC AGTATCCAGA AGGCTACCTG 



1380 
1440 
1500 
1560 
1620 

leeo 
GAAGCCCTGG CCAACCGAGA
CAGGAGGGGG GCTTCGCGTC 
GGAGGTGGCC CGAGCAGGGC 
CCCTACAGTC TCACGGCCCA 
CTGTGGAATG AQGTCCTGGC 
TTGTTCCTGA GTAAAGTQOA 
CGGCCCATCA CXSftCOGTGTG 
CGGGCACAGG TGTTCAGCTG 
CAGGTGAACC AGCCTCTGCfV 
CGGTCSATCTC CAAGCACTTC 
CATCGGCACT GATTTTGTTC 
CCTAAAAAGG TTTGTCTTCC 
GAATTTATGT ATTCTGGCTA 
CATAAAAGCC TGCSVATTTCT 
ACTACGTGGT GTGGAGGCTG 
CAACICTTTA AGAAGGCGAC 
ASCAAGCATC TTCCTGACAB 
TGGCCCGTGG CAGCCCGTGQ 
AAAOAGOAAA CATCTCGGGC 
TGCTTAGCGT CTGAGATCCG 
CACGCAGAAA TGGCCTCAAG 
TGTCCGACGA AGGCGGCCAC 
GATTCGTTCC TTCTTTCTAA 
QTCAACCAGA TTCTASAAAC 

GGAACccrrrr gagccttata 

CTTACAAOAiS GGTTTTTTTT 
TTTTTTTTQT AGTTACTGTA 



GCGAGAGAAG GAGAACAGCA AGAGGGAGGA GGAGGAQCAQ 
GAAGTCGGCA 
CAAGGTGGAG 
CAACGCCAAG 






CGGGTCCCCG 
GCAGAGCAGC 
GTCACTCAAG 
GGAOACOTTC 
CCAOCACAAC 
CCCTGCCTGC 
GACOGTCCTC 
TCGACAGGCG 
TTAGTGGGCT 
TTTTTTTTTA 
AAAGTTCGAC 
CGACAAAACA 
TTCATGTTTC 



CX3CCGGACAT 
CTCATCAGAG 
GACCGGCOGG 
CAGTOTATCT 



CGAGC6GCAG 



TCCCOGGCTA 
AACGTGTOGG 
AGGTAGTGTT 
TCAAATCTAT 
TGTGTTTAGT 
TTTTAAAGAT 
TTCTCAGAAG 



GCTOGTGTTC 
CAGATCCTTT 
CTATGCCATG 
CGGCAATGGC 
AGGGCTCGTT 
TCCTCCGTTC 
ACATTTTCAG 
TCTTTGAAAA 



AACCAGCTCT 
TTTTGCTGAA 
TAACTTAAAC 
TTTTTATTTT 
TTCTCAGTAT 
ACACAAGATT 
TGGTGTCAAS 
CTTCTCTAGG 

CATTTTQTCA TCTAAAGTCC AGTGACATGG TTCCCCGTGG 
CTCAGCTGTC 
CCTTTGCCTC 
CCTCTGCCCA 
TCCACGTGGG 
CAGCACACGA 



TTGCTGCCAC 



CTAGTTCAAA 
CGTGAAAAGT 
GGGACTCTGC 



2 CCCACCAGAC 



AGACGACAGT 
T6CX3QTCATC 
GATCATTTAC 



CAGTTCTTCC 



TTGCAGCCTA TACCTCAATA 
GAGCAATGTT ATTTTTAAAG 
AGGGAAGAAT GAGACAATTT 
TTAOATTCTC AOAATAAATG 



TATGTACCAA 
TTTTOAAAOG 
AAACAGOGAT 



ATTTTAAATC 
ACCTCCTTAT 
TTTTCTAAAQ 
ATTGAAAAAA 



AGTCACGTGC 
AGCACTGAAT 
TGACACXGGA 
TTTAACTCAG 
GAACACATTT 
ACXiTTAGGGT 
TTTTCTAATT 
ACATACCTGC 
TCTTAGATTA 
TCCAGTACTT 
AAAAAAAA 



2100 
2160 
2220 
2280 
2340 
2400 
24G0 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 



TGACTGACGC 
ARGTGCGTTT 
TATTGAAAAT 
TGGGTGCTTG 
CAAGTGAGAA 
TCTAAATGAA 



3240 
3300 
3360 



TTACCAAAGT 
AGACAAACTG 
TTAATGTATT 
TGTCCAGATT 






I I 

MWIQVRTMDG RQTHTVDSLS 
YEVHLNDTIQ LLVRQSLVLP 
PAOGOMWDGT BLGLYKVNKY 
DVIVHVKYDD VPBNGWQMN 
ABISRKRETR TARBI,YANW 
SCKHCKDOVN RLCRVCACHL 



RLTKVBELRR KIQELFHVEP 
HSTKERDSEL SDTDSGCCLG 
VDARDTNMGA WPEAQWHVT 
SRDVHAHART IIKWQDIJW3 
I^DSUIDCH IIFVDEVFKI 
CGGRQDPDKQ LMCDECDMAF 
KKKAKMASAT SSSQRDWGKG 
VHRFHVAGIH GRSHDGAYSIi 
LTHTHRAIAIi HCFAPIHDQE 
SGFLVWRYU. 



GLQRLFYRGK 
QSESDKSSTH 
RKAPSRDEPC 
QWMLNVNPD 



QMEDGBTIiPD 



HIYCLDPPLS 



gSTSRPALBE 
NPKESaFWYO 
□NPMRRKSGP 
SVPSEDEWYC 
TrVPSNHYGP 



IPGI7VGTHH SFRVQVSBSG 
SGGHDLSQIK RTABQSCDQK 
GGXHSKYAPA EaNRYOGIYK 
LOLTMQYPEO YLEAIiAHRER 
SPRRTSXKTK VEPYSLTAQQ 
TFQCICXX5EL VFRPITTVCQ 



Seq ID NO: 456 DNA sequence 

Nucleic Acid Accession #: NM_001200.1

Coding sequence: 325..1514



3 KPVRWKNVK 



GGGGACTTCT 
TGCCCCAOCO 
TGCCCGACAC 
GAGAAGGAGO 



OGGASAATAA C 
CGCCATCTCC C 
TTCCCAGCX5T G 




41 
I 

: CCACTTTQCQ 

: GccccrccAC 

3 ACTQCGCGQC 
: TTGCGCCAGG 
: CCGCGTGCTT 
3 CTGTCTTCTA 



ACCATGAAGA 
TCTTTAATTT 
TCC6AGAACA 
TTTATGAAAT 

TOATGOGGTQ 



CCACCGGTTG 
ATCTTTGGAA 
AAOTTCTATC 
GATGCAAGAT 
CATAAAACCT 



CTAQACCIGT 
GAGAGGGCAG 
GAACTACCAG 
CCCavCGOAGQ 



ATGAACACAG 
GGCATCCTCT 
AGTCCAGCTG 
GGATTGTGGC 
TGGCTGATCA 
ACTCTAAOAT 
ACCTTGACGA 



ACAAGGTGTC 
CTGGTCACAQ 
CCACAAAAGA 
TAAGAGACAC 



GCAACAGCCA 
AATQCAAGCA 
GGACACGCCA 
TCCAAGAGAC 
ATAAGGCX3VT 



TCTGAACTCX: 
TCCTAAGGCA 
GAATGAAAAG 
CTAGTACAGC 



(XTTTGTACG 
TATCACGCCT 
ACTAATCATG 
TGCTGTGTCC 



AAAATTAAAT 



CCAGCGGAGC 
AAACGAGTGG 
AGTTTATCAC 
ACTATAGCAO 
ACTCGAAATT 
G6TGGGAAAG 
ACCATGGATT 
ATGTTAGGAT 
TGCTAGTAAC 
AAGCCAAACA 
TGGACTTCAG 
TTTACTGCXaV 
CCATTGTTCA 
CGACAlQAACT 
AGAACTATCA 
ACATAAATAT 



TGACGAGGTC 
ACCCACCXXa: 
CTCa«3GTCAG 
CAACACTGTG 



CTCAGCAGAG 



CCCXBTGACC 
TTTTGATGTC 
CGTGGTGGAA 
AAGCAGGTCT 
TTTTGGCCAT 
CAAACAGCGO 
TGACGTGGGG 
CGGAGAATOC CCTTl 



TCCTTTGACC 240 

CTTAGACGGA 300 

GCGTTGCTGC 360 

CGCA6GAAGT 420 

CTGAGCGAGT 480 

AGCAGGGACG 540 

CCOGGCTCAC 600 

CX3CAGCTTCC 660 

CGGAGATTCT 720 

CTTCAGGTTT 780 

CQAATTAATA 840 

AGACTTTTGG 900 

ACCCCCGCTG 960 
GTGGCCCaiCT 
TTGCACCAAG 
GATGGAAAAG 
AAACGCCTTA 
TGGAATGACT 



: AACTCTQTTA 



1020 
lOBO 
1140 

1260 
1320 
1380 
1440 
ISOO 
MVAQTRCLIA lOiLPQVLLGG AAGLVPELGR RKPAAASSGR PSSQPSPEVL SEFELEUiSM 60

PGLKQRPTPS RDAWPP'nfij DliTfRRMSGQP GSPAPDHRLB KAASRMITVR SFHHEESLEE 120 

LPETSQKTTR RFFFHtSSIP TBEPITSAEI. QVFRBQKQDA. LGHNSSFHHR IKIVBIIKPA 180 
TANSKFFVTR hWT 

Seq ID NO: 458 DNA sequence

Nucleic Acid Accession #: NM_001999.2 

Coding sequence: 1..8736

1 11 21 31 41 51 

I I I I I 1 • 

ATGGGGAGAA GACGGAGGCT GTGTCTCCAO CTCTACTTOC TGTGGCTGGG CTGTGTGGTO 60 

CTCTGGGCGC AGGGCACGGC OGQCCACCCT CAGCCTCCTC C6CCCAASCC GCCCX MGCCC 120 

CAGCCGCCGC CGCAACAGGT TCGOTCOQCT ACAGCAGGCT CTOAAGGaX} 6TTTCTAG0G 180 

CCCXSWytATC GCGAGGAGGG TGCCXJCAGTC GCCAGCCGCG TCCGCCGGOG AGGACAGCAG 240 

GACGTQCTCC GAGGGCCCAA CGTGTGCGGC TCX»GATTCC ACTCCTACTG CTGCCCTGCiA 300 

TGGAAGACGC TCCCTGGAGG AAACCAGTGC ATTQTCCCX5R TrTGTAGRAA TAQTTGTGGA 360 

GATGGATTTT GTTCCCGTCC TAACATGTGT ACTTGTTCCA GTGGGCAAAT ATCATCAACC 420 

TGTGGATCAA AATCAATTCA GCAGTGCAGT GTGAGATGCA TGAATGGTGG GACCTGTGCA 480 

GATGACX3VCT GCCAGTGCCA GAAAGGATAT ATTGGAACTT ATTGTGOACA ACCTGTCTGT 540 

GAAAATGGAT GTCAGAATGG TGGACGTTGC ATCQCCCAAC OOTGTGCTTG TGTTTATGGG 600 

TTCACTGGTC CACAaTGTGA AAGAGATTAC AfiOACAGOCC OGTGTTTCAC TCAGGTCAAC 660 

AACCAGATCT GCCAAGGGCA GCTGACAaOC ATTCTCTGCR OaAGACTCT GTGCTGTGCC 720 

ACCACTGOAC GQGCGTGGGG CCATCCCTGT GAGATOTGTC CAGCCCAGCC TCAGCCCTGC 780 

CGACGGGGTT TCATCCXTCAA CATCCGCACT GGAGCTTGCC AAGATGTTGA TGAATGCCAG 840 

GCTATCCCAG GGATATGCCA AGGAGGAAAC TGTATCAATA CAGTGGGCTC TTTTGAATGC 900 

AGATGCCCTQ CTGGTCACAA ACAGAGTGAA ACTACTCAGA AATGTGAAGA CATTGATGAG 960 

TGCAGCATCA TrCCTGGGAT ATGTGAAACT GGTGAATGTT CCAACACCX5T GGGAAGCTAT 1020 

TTTTCTGTTT GTCCACGTGG ATATGTAACC TCAACAGATG GCTCTOGATG CATCGATCAG 1080 

AGAACAGGCA TGrrOTTTCTC GGGCCTGGTG AKIGGCCGCT GTGCACAAGA GCTCCCGGGG 1140 

AGAATGAOOA AAATOCAQTO CK3CTGTGAG CCTGGCOGCT GCTGGOSCAT CS3GAACCATT 1200 

CCTGAAGCCT GTCCTOTCAG AGGTTCTQAG GAATATCGCa GACTTTGCAT GGATGGACrr 1260 

CCAATGGGAG QAATTCCAGG GAGTGCTGGT TCCAflACCTG GAGGCACTGG GGGAAATGGC 1320 

TTTGCCXXaA GTGGCAATGG CAATGGCTAT OGCCCAGGAG GGACAGGCTT CATCCCCATC 1380 

CCTGGAQGCA ATGGCTTTTC TCCTGGCGTT GGGGGAGCCX3 GTGTGGGGGC CGGGGGACAG 1440 

GGACCTATCA TCACTGdACT AACAATTCTG AACCAOACAA TAGATATCTG TAAOCATCAT 1500 

GCTAACCTTT GTTTAAATGG ACGCTGTATA CCAACTGTCT CAAGCTACCG ATGTQAATGC IS SO 

AACATGGGTT ATAAGCAGGA TGCAAATGGA GATTGTATAG ATGTTGATGA ATGCACATCA 1620 
STTAAC ACACCTGGTT CCTATTATTG TAAATGTCaiT 



GCTGOATTCC ASAOGACTCC TACCAAGCAA GCATGCATTG ATATTGATGA GTGCATCCAG 1740 

AATGGGGTTC TITOTAAAAA CGGTCQATCC CTGAACTCAQ ATGGAAGTTT CCAGTGCATT 1800 

TGCAATCCXX3 GCTTTGAATT AACTACAGAT GGAAAAAACT GTOTTGATCA TGATGAATOT 1860 

ACAACTACCaV ACATGTGTTT GAATGGAATG TGCATCAATO AA6KIGGCAG CTTCAA GTOC 1920 

ATCTGCAAAC CAGGATTTGT CTTGGCTCCA AATGGGCGTT ACTGTACTOA TGTTGATGAA 1980 

TGCCAGACCC C31GGAATCTG CATGAATGGG CACTGCATCA ACAGTQAAGG GTCCTTCOGC 2040 

TGIGACTGTC OCCCAGGCCT GGCTGTGGGC ATGGATCGAC GTGTOraiGT TGATACTCRC 2100 

ATGCGCAGTA CCTGCTATGG AGGAATCAAG AAAGGAGTGT GTG TGCGT CC TTTCCCCGGT 2160 

GCAGTGACCA AGTCCGAATG CTGCTGTGCC AATCCAGACT ATGGTTTTGG AGAACCCTGC 2220 

CAOCCATGCC CTGCAAAAAA TTCAGCTGAA TTCCACGGCC TTTGTAGTAG TGGAGTAGGT 2280 

ATCACTGTGG ATGGAAGAGA TATCAATGAA TGTGCTTTGG ATCCTGATAT ATGTGCCAAT 2340 

GGGATTTGTG AAAACTTACG TGGTAGTTAC CHTTGTAATT GCAACAGTGG CTATGAACX» 2400 

GATGCCTCTG GAAGAAACTG TATTGACATT GATGAATGTT TAOTAAACAG ACTGCTTTGT 2460 

GATAACGGAT TGTGCCGAAA CAOGCCAGGA AOTTACAGCT GTACGTGCCC ACCACGGTAT 2520 

GTGTTCAGGA CTGAOACAGA GACCTGTGAA GATATAAATG AATGIGftflAQ CAACCCATGT 2580 

GTCAATGGGG CCTGCAGAAA CAACCXTGBA TCTTTCAATT GTGAATQTTC OCCCGGCAGC 2640 

AAACTCAGCT CCACAGGATT GATCTGTATT GACAGCCTOA AGGGGACCTG TTGGCTCARC 2700 

ATCCAGGACA GCCGCTaXGA GQTGAATATT AATGGAGCCA CTCTGAAATC TCAATGCTGT 2760 

GCCACCCTCG GAGCCGCCTG GGGGAGCCCC TGXGAGCGGT GTGAACTAGA TAC3«3CTTGC 2820 

CCAAGAGGGC TTGCCAGGAT TAAAGGTGTT ACGTGTGAAG ATGTTAATGA GTGTGAGGTG 2880 

TTCCCTGGCG TTTGTCCAAA TGGACGCTGT GTCAACaGTA AGGQATCTTT TCATTGCGAG 2940 

TGCCCTGAAG GCCTTACGTT GGATGGGACT GGCCGTGTAT GTTTGGATAT TCGCATGGAG 3000 

CACI6ITACT TGAAST6GGA TGAAQATGAA TGCATCCACC CCGTTCCTGG AAAGTTCCXSC 3060 

ATeOATOCCC GCTGCTGTGC TGTOGQQGCB GCTTGGGOCA COSAGTGTGA GGAGTGCCCC 3120 

AAACCIOGCa CCAAGGAATA OOAGRCACTG TQCCOCCXSOG GGGCTGGCTT TGCTAACCGA 3180 

QGGGATGTTC TTACTGGGOS GCCATTTTAC AAAGACATCA ATOAATGCAA AGCATTTCCT 3240 

GGGATGTGCA CTTATGGGAA GTGCAGAAAT ACAATCGGAA GCTTCAAATQ CCGTTGCAAT 33 00 

AGTGGCTTTG CTCTAGACAT GGAGGAAAGA AACTGCACGG ACATCGACGA GTGCAGGATT 3360 

TCTCCTGACC TCTGTGQCAG TGGAATCTGC GTCAATACAC CGGGCAGCTT TGAGTGCGAG 3420 

TGCTTOGAAa GCTATGAAAG TGGCTTCATG ATGATGAAGA ACTGCATGGA CATTGA CGGA 3480 

TGTGAAOGTA ACXSnXTTCCT TTGTAGGGGT GGCAOCTGTG TGAACACTGA GGGCAGCTTT 3540 

C3M3TQTGACr QCCCACTGQQ ACACOAGCTG TCACCATCCC GTGRQQACTG TGTGGATATT 3600 

AATGSUVTGCT CCCTOAGTOA CSUWCTCTGC AGAAATG6AA AATQTGTGAA CATGATTGGA 3660 

ACCTATCAGT GCTCTTGCAA TCCTGGAXAT CAGGCTACGC CAGACCGCCA GGGCTGTACA 3720 

GATATTGATG AATGTATOAT AATGAACQOA QOCTQTGACA CKCftGTQCAC AAATTCAGAG 3780 

GOAAGCTACG AATGCAGCTG CAGTGAGGGT TATGCCCTGA TGOCaGATGG QAGATCGTGT 3840 

GCAGACATTG ATGAATOTGA AAACAATCCT GATATCTGTG ATGGCGGCCA GTGTACXawiC 3900 

ATTCCTGGAG AGTATCX3CTG CCTCTGCTAT GATGQCTTCA •PGGCTTCCAT GGACATGAAA 3960 

ACATGCATTG ATGTCAATGA ATGTGACCTA AATTCAAATA TCTGCATGTT TGGGGAATGT 4020 

GAGAACACAA AGGGATCCTT CATTTGCCAC TGTCAGCTGG GTTACTCAGT GAAGAAGGGQ 4080 

ACCACAGOAT GTACAGATGT GGATCAGTGT GAAATTGGTQ CTCATAACTG COACATGCAT 4140 

OCCTCATOTC TGAATATCCC AGGAAOCTTC AA3TGTAQCT OCaGAGAAGG CTGGATTGOA 4200 

AACGGCATCA AGTGTATTGA TCTCSACOAA TOTTCTAATG GAACCCACX» GTGTAGCATC 4260 

AATGCTCRGT QIGTAAATAC CCOGGGCTCA TACOGCTGTG CCTGCTCCS3A AGGTTTCACT 4320 

GGTGATGGCr TTACCTGCTC AOATGTT6AT eAGTSTGCAQ AAAACATAAA CCTCTGTQAG 4380 
AACXXaVCAGT GCCTTAATGT CCCGGGTGCh TATCGCTGOG AGTGIGAGAT GGGCTTCRCT 4440

CCAGCCTCAG ACAGCAQATC CTGCCAAGAT ATT6ATGAAT GCTCCTTCCA AAACATTTGT 4500 

GTCTCTGGAA CATGTAATAA CCTGCCTG6A ATQTTTCATT GCATCTGCGA TQAT6GTTAT 4560 

GAATTGGACA GAACAGQAGG GAACTGTACA GATATTGATG AGTGTGCAGA TCCTATAAAC 4620 

TGTGTCAATO QCCTATGTGT CAACACGCCT GGTCGCTATG AGTGTAACTG CCCACCCGAT 4680 

TTTCAGTTGA ACCCAACTGG TGTGGGTTGT GTTGACAACC GTGTGGGCAA CTGCTACCTG 4740 

AAGTTTGGAC CTCX3AGGAGA TGGGAGTCTG TCTTGCAACA CCGAGATCX3G GGTGGGCGTC 4800 

AGTCGCTCTT CATGCTGCTG CTCTCTGGGA AAGGCCTGGG GAAACCCCTG TGAGACATGC 4360 

CCCCCTGTCA ATAGCACTGA ATATTACACC CTGTGTCCCG GAGGTGAAGQ CTTCAGACCT 4920 

AACCCCATCA CAATCATTTT AGAAGACATT QAC6AATGCC AGGAGTTACC AGGTCTCTGC 4980 

CaGGGIGOAA ACTGCRTCAA CACTTTTGGG AGCTTCCRQT GTGAGTGCCC ACftAGGCTAC SC40 

TACCTCAGCG AGGATACCOQ CATCTGTCAG GATATTGATG AGTGTTTTGC ACKTCX::TGaT SlOO 

GTGTGTGQGC CTGGGACCTG CTATAACACC CTGGGAAATT ACAOCTGCAT TTGCCCACCT 5160 

GAGTACATGC AGGTCAATGG AGGCCACAAC TGCATGGACA TGAGAAAAAG CTTTTGCTAC 5220 

CGAAGCTATA ATGGAACCAC TTGTGAGAAT GAGTTGCCrT TCAATGTGAC AAAAAGGATG 52B0 

TGCTGCTGC3V CATATAATGT GGGCAAAGCT GGGAACAAAC CTTGTQAACC ATGCCCAACT 5340 

CCAGGAACAG CTGACTTTAA AACCATATGT GGAAATATTC CTGOATTCAC CTTTGACATT S400 

CACACAGGAA AAGCTGTTGA CATTGATGAA TGTAAAGAGA TTCCAGGCAT TTGTGCAAAT 5460 

G6T6TGT6CA TTAACCAOAT TGGCAOTTTC CGCTGTGAAT GCCCTACAGQ ATTCAGTTAC 5520 

AAT6ACCTGC TGrTOGTTTG TGAAQATATA GATGAGTGCA GCAATQQTGA TAATCTCTGC SS80 

CAGCGGAATO CAGACT6CAT CAATAGTCCT GGTAGTTACC GCIGTGAATO TGCOGCOGGT S640 

TTCAAACrrr CACCCAATGO GGCCTGTOTA GATCQCAATG AATGTTTAGA AATTCCTAAC 5700 

GTTTGCAGTC ATGGCTTGTG TGTTGATCTG CAAGGAAGTT ACCAGTGCAT CTGCCACAAT 5760 

GGCTTTAAGG CTTCTCRGGA CCAGACXATG TGCATGGATG TTGATGAGTG CGAGCGGCAC 5820 

CCATGTGGAA ATGGAACTTG TAAAAACACC. GTTGQATCCT ATAACTGTCT GTG CTAC CCA SB80 

GGGTTTGAAC TCACTCATAA TAATGATTGC CTGGACATAG ATGAGTGCAG TTCCITTTTT 5940 

GGTCAGGTGT GCAOAAATGG ACGTTGTTTT AATGAAATTG GTTCTTTCAA GTGTCTATGT 6000 

AACGAAGGTT ATGAACTTAC OCCAGATGGC AAAAACTQTA TAGACACTAA tOAbTGTOTC 6060 

GOCCTTCCOO QCTCTTGCTC TCCTGGTACC T6TCAGAATT TGOAGGGATC CTTCAGATGC 6120 

ATCTGTCCCC CAGGGTATGA AOTAAAAAQC OAOAACTGCSV TTGATATAAA TGAATGTGAT 6180 

GAAQATCCCA ACATTTGTCT TTTTGGrTCC TGTACTAATA CTCCAGGGGG CTTCCAGTGC 6240 

CTCTGCCCCC CTGGCTTTGT ACTATCTGAT AATGGACGGA GATGCTTTGA TACTCGCCAG 6300 

AGCTTCTGCT TCACAAATTT TGAAAATGGA AAGTGTTCTG TACCCAAAGC TTTCAACACC 6360 

ACAAAAGCAA AATGCTGCTG TAGTAAGATG CCAGGAGAGG GCTGGGGGGA CCCCTGTGAG 6420 

CTGTGCCCCA AAGACGATGA AGTTGCATTT CaM3GATTTGT GTCCATATGG CCATGGAACT 6480 

GTCCCTAGTC TTCATGATAC ACGTGAAGAT GTCAATGAGT GTCTTGAGAG CCCAGGCATT 6340 

TGTTCAAATG GTCMXGTAT CAACACCBAC GGATCTTTTC GCTGTGAATG TCCAATGGGC 6600 

TACAACCTTG ACTACACTGO AOTAGGCIGT QTGGATACTG ATGAGTGTTC AATCGGCAAT 6660 

CCGT6TGGAA ATGGTACATG CACCAATSTT ATTGGGAGTT TTOAATGCAA TrQCAATOAA 6720 

GGCTTTGAGC CA«3GCCCAT 6ATGAATTGT GAAGATATCA AOGAATGTGC CCAGAACCCA 6780 

CTGCTGTGTG CTTTAOGCTG CATGAACACT TTTGGGTCCT ATGAATOCAC QTGCCCGATT 6840 

GGCTATGCCC TCAGGGAAGA TCAAAAGATG TGCAAAGATC TGGATGAATG TGC TGAAGGG 6900 

TTACACGACT GTGAATCTAG GGGCATGATG TGTAAGAATC TAATOGGCAC CTTCATGTGC iS960 

ATCTGCCCTC CTGGAATGGC CCGAAGGCCC GATGOAGAAO GCTGTQTAeA TGAAAATGAA 7020 

TGCAGGACCA AGCCAGGAAT CTGTGAAAAT GGACGTTGTG TTAACATTAT TGGAAGCTAT 7080 

AGATGTOAGT GTAATQAAGG ATTCCAGTCA AGTTCTTCAG GCACTGAATG CCTTGACAAT 7140 

CX3ACAGGGTC TCTGCTTTGC AGAGGTACTG CAGACAATAT GTCAAATGGC ATCCAGTAGT 7200 

CS3CAATCTCG TCACTAAGTC AGAATGCTGC TGTGATGGTG GGCGAGOCTG GGGCCACCAG 7260 

TGCGAGCTTT GCCCACTTCC TGGAACTGCC CAGTACAAAA AGATATGTCC TCATGGCCCA 7320 

GGATATACAA CTGATGGAAG AGATATTGAT GAATGTAAGG TAATGCCAAA CCTCTGCACC 7380 

AATGOTCAGT QCATCAATAC CATGGGCTCA TTCOGATGCT TCTGCSkAQGT TGGCTACACC 7440 

ACAGACATCa GTGGAACCTC TTGTATAGAQ CTTGATQAAT GCTCCCAQTC CCOGAAACCA 7500 

TGCAACTACA TCTGCAAGAA CACTGAGGGG AGTTATC3«ST GTTCRTGTCC GAGGGGGTAT 7S60 

GTCXTTGCAAG RGGATGQAAA GACATGCAAA GACCTTGATG AATGTCAAAC AAAGCAGCAT 7620 

AACTGCCAGT TCCTCTGTGT CAACACCXTTG GGGGGGTTTA CCTGTAAATG TCCACCTGGT 7680 

TTCACACAGC ATCaVCACTGC TTGTATCGAC AACAACGAAT GTGGGTCTCA ACCTTTGCTT 7740 

TGTGGAGGAA AGGGAATCTG TCAAAAOVCT CCAGGCAGTT TCAGCTGTGA ATGCCAAAGA 7800 

GGGTTCTCTC TTGATGCCAC CGGACTGAAC TGTGAAGATG TTQATGAATG TQATGGGAAC 7860 

CAtSWSGTGCC AACACGGCTG CCAGAACATC CrGGGTGGCT ACAGATGTGG CTGCCCCCAA 7920 

GGCTACATCC AGCACTACCA GTGGAATCAG TGTGTCGATG AGAAT6AATG CTCC3UVTCCC 7980 

AATGCCTGTG GCTCTGCTTC CTSCTACAAC ACCXnXSGGGA GTTACAAOTO CXSCCTOCCCX: 8040 

TCGGGGTTCT CCTTCGACCA GTTCTCCAQT GCCTGCXaiCG AOBTGAATOA GTGCTCGTCC 8100 

TCCRAGAACC CCTGCAATTA OGGCTGCTCT AACAGGGAGG GGGGCTACCT CTGTGGCTGC 8160 

CXXXXriGGGT ATTACAGAOT GGGACAAGGC CaCTGTGTCT CAGGAATGGQ ATTTAAC3kAO 8220 

GGGCflCTACC TGTCACTGGA TACAGAGGTC GATGAGGAAA ATGCTCTGTC CCCAGAAGCA 8280 

TGCTACGAGT GCAAAATCAA CGGCTATCCT AAGAAAGACA GCSUSGCAGAA GAGAA6TATT 8340 

CATGAACCTG ATCCCACTGC TGTTGAACRG ATCAGCCTAG AGAGTGTtXSA CATGGACAGC 8400 

CCCXSTCAACA TGAAGTTCaUV CCTCTCXXaC CTOGGCTCTA AGGAGCACAT CCTGGAACTA 8460 

AGGCCCGCC3V TCCAQCCCCT CAACAACCAC ATCCGTTATS TCATCTCTCA AGGGAACGAT 8520 

GACAOairCT TCCGCATOCA CCRAAGGAAT GGGCTCAGCT ACTTGCACAC GGCCAAQMG 8S80 

AAGCTCATGC CCS3GCACATA CACACIGGAA ATCACTAGCA TCXXnCTCTA C RAGAA GAAG 8640 

GAGCTTAAGA AACTGGAAGA GAGCAATGAG GATGACTAOC TCCTASGGGA GCTTGGGQAG 8700 

GCTCTCAQAA TGAGGCTGCA GATTCAGCTC TATTAACCGT TCACAGACTT GGGCCX»GaC 8760 

TCAAATCCTA GCACAGCCAG TCTGCAGAAO CATTTGAAAA GTCAAGGACT AATTTTAAAG 8820 

AGGAAAAATA ATAATAACTC TTGTTTCTTT CCTCCCTGTC TTAGACTTTG AATGTTQACC 8880 

CTCACAGGGA GGQATAATTT AGACTCIGGT ATGGCCAAAG ATTTGAGCTC AAAGGCAACC 8940 

GTGGTTACTG TATTrrTTAT ATAACTTCAT TTTAAAATAT ATTAAAAGAA ACCTA AATGT 9000 

TCaAGATATC AGCATATGGC ACTAAATGCA CAAAAATAAT GTGAGCTTTT TTTTTTTTTT 9060 

CCIGTTAGCA GXCI6TAACA CTTTGGQTAT TTTGCTATAG TTGCTAATTA AAAAAATATA 9120 

GATGTTTATT TATTTTTAAT GCAOTAATAT ATGGAGAAAT GAACAAACTA TGTAAACAAA 9180 

AAGGGAAACT CACTTGTTTT TCTTTAGATT TATAAATTTG AGCTATTTTT TTTAGAGGTG 9240 

CTTTTTAAAA ATCCAATAGA TACAAGAGAT GTTTCCTTTG GTTTTCTGCC AGTCATCCAG 9300 

CTGATACACA CCTGATCGAT TTTAAAGAAA GCX3VCACAGA GCTGAATCXK3 GCAGTQCTRA 9360 

TCAATAATTT AAAAGACATG AATGTCATTA GATCCTTTAT AAOGTAGATC GAAGCCAAAG 9420 
CAGCTCATTT GTGAC3VACAT TTCATATCAC CAGACACACC AGGCAACAGA AGTTGAAGCA 9480 

CAACXaCTGT AGCAAAATAC CTTGACTGCT TGTGAGACCA TTAGCATTGC AGGCCAAACC 9540 
GTACTGTATT TCCTTCTCAT AACCTCAAGG AACXSITATGT GCTACCCACA ACACCTCATT 9600 
CTTACCCAOG GTGCX3CTGCG
TTGAAAGGGA ACACCTGGCA 
ATTATGTTCyv AGTTATTTCA 
GGAATATATG TTGTTGTTGT 
TQTAGTTATA CACCATATGC 
ACAATQAATT GATGTTTAGT 
TATTAAGABC AOGTATCCAT 
CCAAACCTCA TATGTGAAAT 
CTGTGCTGAC CAAAGATTAG 
TTAATAACIA AAAAAAAACT 



TCCTCATGGT 



GGATTGCCAT 
TGTTTTAAAC 
CTCaVTTTTAT 
TTGCTTTAGT 
TATTCTTCrC 



ACTGTAGGCA 
TTTCGTGCTG 
ATG TGCAAAC 
CCATTTTTTT 
CATAGCCTAT 
CATTTAAAAA 
AACCCAAQAA 



GCTGAAGAAC 
TCTTAAATAA 
AAATCATGCA 
TTTAGAATTT 
TGTGTATGAA 
GATATTGTAC 



V TACCCASTAT TTTGAGGTTT TATTGTTTTT 



CGCCGTTCCC 9660 
TGGTGCATTT 9720 
ATGCAGCCAA 9780 
TCATTAATAC 9840 
AGATGTTTGT 9900 
CAGGATCTGC 9960 
GQACCAGIGA 10020 
CCICTCAAAC 10080 
10140 
MGSItRRLCLQ LYPIiWLGCW LMAQGTAGQP QPPPPKPPRP QPPPQQVRSA TAGSEGGFIA
PEYREEGAAV ASRVRRRGQQ DVLRGPMVOG SRPHSYCCPG HXTLPGQNQC IVPICBNSCG 
DGPCSRPNMC TCSSGQISST CGSKSIQQCS VRCMNGGTCA DDKCQCXIKGY IGTYCGQPVC 
ENGCQNGGRC lAQPCACVYG PTGPQCERDY RTGPCFTOVN NQMCQGQLTG IVCTKTLCCA 
TTGRAWGHPC EMCPAQPQPC WtGFIPNIRT GACQDVDECO AIPGICQGQJ CINTVGSFEC 
HCPAGHKQSB TTQKCEDIDE CSIIPGICET GECSNTVGSY PCVCPRGYVT STDGSRCIDQ 
RTGMCFSGtiV NGRCAQELPG RMTKMQCXXB PGRCWGIGTI PEACPVRGSE BVRRLCMDGL 
PI«3aiPGSA6 SRPGGTGGNG FAPSQIGIIGY GPGGTGFIPI PGGNGFSPGV GGAGVQAGGQ 
GFIITGLTIL HQTIDICKHH ANLCLNGRCI PTVGSYRCEC MMBYHQDAN6 DCIDVDECTS 
NPCTNGDCVN TPGSYYCKCH AGPQRTPTKQ ACIDIDBCIQ NGVKaCMGRC VNSDGSFQCI 
CWAGFELTTD GKNCVDHDEC TTTNMOJJGM CINEDGSPKC IC3CPQFVLAP NGRYCTOVDE 
CQTPGIC3«JG HCINSBGSFR CaJCPPGLAVG MDGRVCVDTH MRSTCYGGIK KGVCVRPFPG 
AVTKSECCCA NPOIGPGEPC QPCPAKNSAE FHGLCSSGVG ITVDGRDINE CALDPDICAN 
GICENLRGSY RCNCMSGYBP DASGRNCIDI DEGLvilRLLC DNGLCRNTPG SYSCTCPPGY 
VFRTETETCB DINECESNPC VNOAGRNNLQ SFNCECSPGS KLSSTGIICI DSliKGTCMIiN 
IQDSRCEVNI NGATLKSECr AXliOAAHGSP CEECELDTAC PRGLAHIKGV TCEDVNECEV 960 
FPSVCPNGRC WSKGSFHCE CPEGLTLDGT GRVCLDIRME QCYIJCWDEDE CIHPVPGKFR 
MDACCCAVQA AWGTECEECP KPGTKEVETL CPRGAGPRNR GDVLTCRPFY KDINECKAFP 
GMCTYGKCRN TIGSFKCRCIt SGFAU3MEBR NCXDIDECRI SPDLCGSGIC VNTPGSFECE 
CPEGYBSQPM MMKNCMDIDG CERNPU.C31G GTCVNTEGSP QCDCPLGHBL SPSREDCVDI 
NECSI.SDNLC RNGKCVNICG TYQCSCNPGY QATPDRQQCT DIDBCMIhajQ GCDTQCTHSB 
GSYECSCSEG YALMPDGRSC ADIDECENNP DICDGGQCTN IPGEYRCI.CY OGFMASMDMX 
TCIDVNECDI. NSNICMPQEC ENTKaSPlCH CQLGYSVKXG TTGCTDVDEC BIGAHircDMH 
ASCtNIPOSF KCSCREGWIG HGIKCIDLDB CSNGTHQCSI NAQCWHTPGS YRC3W:SBGFT 
QDGFTCSDVD ECAENINUCB NGQCLNVPGA YRCBCEMGFT PASDSRSCQD IDECSFQMIC 
VS6TCNNIJ>0 MFHCICDDGY EIJ3RTGGNCT DIDECMPIH CVMGLCVNTP QRYECHCPPD 
FQUNPTCVGC VDNRVGNCYI. KFGPRGDGSI. SCWTEIGVGV SRSSCCCSIX3 KAWGNPCETC 
PPVHSTBYYT LCPGGBGFRP MPITIILEDI DECQELPGLC QGGNCINTFG SFQCECPQGY 
YLSEDTRICE DIDECFAHPG VCGPGTCYHT LGHYTCICPP EYMQVNGGHN CSffiMRKSFCY 
HSYNGTTCEN ELPFNVTKHM CCCTYMVGKA GNKPCEPCPT PGTADFKriC GHIPGFTFDI 
HTGICAVDIDE CKEIPGICAN GVCIHQIGSF RCECPTGPSY MDIiLLVCEDI DECSNGDNIiC 
QRNADCINSP GSYRCECAAG FKI^PNaACV ORNBCI.BIPH VCSBQLCVDL QGSYQCZCHK 
GFKASQDQTM CWDVDECERH PCGNGTCKHT VGSYMCLCYP GFELTmmDC UJIDECSSFF 
GQVCBNGRCF NEIGSFKCLC NEGYELTPDG KNCIDTMECV ALPGSCSPGT CONI-EGSFRC 

ICPPGYEVKS EHCIDINECD EDPMICIIFGS CTMTPGGFQC LCPPGFVLSD HGRRCFDTRQ
SFCFTNFENG KCSVPKAFNT TKAKCCCSKM PGEGWGDPCE LCPKDDEVAP QDLCPYGHGT
VPSIJHDTRED VNECLESPGI CSNGQCINTD GSFRCECPMG YNLDYTGVRC VDTDECSIGN

PCGNGTCTNV IGSFECNCNE GFEPGPMMNC EDINBCMNP LI-CALRC3WT FGSYECTCPI 
GYALREDQKM CKDLDECAEG 1.HDCESRGMM CKNLIGTFMC ICPPGMARRP DGEGCVDEHE 
CRTKPGICEM GRCVNIIGSY RCECMEGFQS SSSOTECUMI H<iai.CFABVI. QTICQMASSS 
RNLVTKSBCC CDGGRGWGHQ CELCPLPGTA OYKKICPHOP GYTTDGRDIO BOWMPNUrT 
NGQCINTMG3 FRCFGKVQYT TDISGTSCID U>ECSQSPKP CHYICKHTEQ SYQCSCPRGX 
VLQEDGKTCK DU3ECQTKQH NCQFLCVNTL GGFTCKCPPQ FTQHHTACID HNECGSQPLL 
CGGKGICQNT PGSPSCECXJH GPSLDATOm CE0VDECDGN HRCQHGOONI LGGYRCGCPQ 
GYIQHYQWKQ CVDENECSNP HACX3SA2CYN TLGSYKCRCP SGFSFDQPSS ACHDVNECSS 
SKNPCNYGCS NTEGGYLCGC PPGYYRVGQG HCVSGMGPNK GQYLSLDTEV DBENALSPEA 
CYBCKINGYP KKDSRQKHSI HEPDPTAVEQ ISLESWDMDS PVNMKFNLSH LGSKEHILEL 
RPAIQPUffilH IRYVISQOID DSVFRIHQRM GLSYIOTAKK KLMPGTYTI^l ITSIPLYKKK 
g ALRMRIiQIQIi Y 



1140 
1300 
1260 
1320 
1380 
1440 
1500 
1560 



1860 
1920 
1980 
2040 
2100 



2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 



Seq ID NO: 460 DNA sequence
Nucleic Acid Accession #: NM_013372.1
Coding sequence: 63..617



I 



GCGGCCGCAC TCAGOGCCAC GCGTGGAAAG 
GTATGAGCGB CACAQOCTAC ACGQXGGGAO 



GOGGCCAAGa G 



AGCAGACCAT 



CCTGCTCCTT 
AACTACAGCC 
0CATCX3ATTT 
AGQAA6TCOC 
AACCCCCAGC 



CCACGAGGAA 
CTCTTTCTAC 
CTGCAAGCCC 
ACCTACCAAG 
GOATTAAaCC 
AGACCTAAAA 
TGCCTCCTGO 



AAATAOCTGA 
GGCTGCAACA 
ATCCCCAGGC 
AAGAAATTCA 
AAGAAGAGAG 
AAATCCAGGT 



CXK»GGCCCC 
CCCTGCTTCT 
AAGGTGCCAT 
CCCAGCAGCC 
GGGAGGAGGT 
AaCGAGACTG 
GTCGCACCAT 
ACATCCX3GAA 
CTACCATGAT 
TCACACGTGT 



3 TAAACATATC T 



TGGCTCCAGG 
6CTGGAGTCC 
GTGCAAAACC 
CATCAACCGC 
GGAGGAAGGT 
GGTCACACTC 
GAAGCAGTGT 
TGTCCTAGGA 
TTAAACCTAG 
GTTOQTGTGC 
AGTCTCTGCT 
ASAAACCCAC 



I 

: CGCACTQACA 
ACCCTGCTGC 
GACAAGGCCC 
AACOGGGGGC 
AGCCAAGAGG 
CAGCOGCITA 
TTCTGTTACG 
TCCTTTC3W5T 
AACTGCCCTG 
CGTTGCATAT 
ATGCAGCCCC 
AGGCCAGAAG 
ATGAGTGTGG 
AGAGAGCACT 
CTCACCCCX3G 
CTCACATCTA AAGGGGCGGG
GOGGACCAGA ATCTCCTTTC 
GACCTGTTTT AGTGCTGCAT 
CTTTCCTCCT CCTCCTCACA 
ATCTCTTGTT TGCCAAGGTT 
TTTTGTGAAG ACCXZTCCAGA 
TGGAGTGAGA AAGGGAGGGT 
GACATTGCAG AAGCTTGAAA 
TTTTCCTAGT ATTTAACRGA 
ATTAACTTTG GCOGTTGCAA 
ACCACTCCTA TGTTC6GACC 
CCCTCAGOTG GAAAAGAGAG 
AAACCKCAGA CGCTGAAATT 
TCCATTOCAC TATTTCCCAT 
GCCTCTGCTG AGT6TACCTG 
TTTTAGCAAG AHATATTKTG 
RAGAGAAGAC GACGAGAGTA 
GGTGTTAATA CCTGGTAGAA 
AGGATCTGAG GGGACCCTGT 
CTACTGSnG GATGGACATA 
TCTGATTAAA CTTGGCXTAC 
AGGGTGGGTQ AACTTTATTG 
TTTTATATAC AAACTCCCTG 
AGTCCTATGT AATATGGAAA 
TCTGGCATTC AGAGAACCCT 
TCCAGTGCTC TCCCATCTAA 
ACACCCAAAA TGTTGGGTCT 
CTAGGCGAAT TTGTCCAAAC 
CCAAATCTTT GTATTGTCCA 
ACACAAATGC TAAQGCAGAA 
ATGTAAAACX: ACACCAGGGA 
TTGTTGTTTT AACTCTGCCA 
AGCAGTAATC TTCTTTTAGQ 
CAAGTAAAGA GAATTTCCTC 
TAAAAGOVTA TCACTAGCCA 
GACTAGTACA AATGTGGTGT 
TTTTATTCGA GTCACTGATG 
TTATQGCAAG ATATTTGTGG 
TGAATTTTAT GATGTACACT 
GTCTGTAAGT TGTTTTTTGT 
TGGAGGAGAO GATAATTTCC 
ATTTAATGTA 
TTCTCCTTTA 
TTAGAGTCTT TTATCTOGTC 
GAGTCAGTGC CTGAATCTTT 
AAGCAATATT AAGAAAGACT 
ACGAATAGCA GATAATGATG 
AATGAAAAGT GGATAAACAG 
AGTTCTATTO ACATTCCTCA 
AAATCATTTA AAAACXK3CAA 
ATGAAAGGGG AOTTGATAGT 
ACXAOAATTT AATTTTCACC 
TAAATTAAAC CTATTCTTTC 



GCCQTGGTCT 
GGAATGAATG 
TCX3ACATGGA 
ATCCATCTCT 
CCTAAATTAA 
CTCTGOOAGA 



GGTTCTCJACT 
TTCATGGAAG 
AAAGTCCTTT 
ACCCAAGTGA
TCTGCTCAAA 
CAAGCAAGTT 
GTAG TTT AG A 
CCTAATACCT 
AATGCTTCTG 
ACAGTAAGTC 
GGGGTCTTTT 
AGGAAATAAA 



TTCACTTAAC 
GGCTGGTGTG 
GCCAAATCAG 
AGAACACAGG 
ACAGAGGAGA 
CCTAACACCA 
AGCTAAACCA 
ACTCTCTGCA 
TTCCTTTATC 
AGAGCCACTA 
TAAAGATGAR 



ATAGTGACTA 
CATGATGCAA 
GGCAAGGACA 
GTCCAGCAAA 
CTGATGCTTC 
AATCAGATTG 
AACTGAAAAC 
AACCAACTCC 
TAGGGGTGGG 
GTGGTTATAG 
ACTTGATTGA 



TGTCAGTCTA 
ATGTTTTTCA 
AGCAGGATAG 



TGAGAAAGTC 
CCAGAAAGTO 
ATAAATACTG 

AATTAATCAA 
TCAGCTCATT 
TAAAGATCCT 
GACTACTCTG 



TGTTTTAACT ATTGTCAGGA GATTGGGCTA 



TAGGAGAGCA 
ACTATTGTAA 
TGGCAATGGC 
TACTTTGGAT 
AATACTCTTT 
ACAAACACTG 
TGCAACTCGA 
CAACTAAACA 
GATTTTCaAA 
ACATAGTGTG 
CATTCTCCAA 
TTTTGAGGGT 
GGAAAAATGA 
CAAQAATGCA 
AGCTTGTACC 
AACACTAACT 
AAGAGGGAAA 
GTCTTCCAAC 
A TGTA ATGAT 
TCTTGATCAT 



GGGRATTGCC 
TATGACCTCC 
TAGCATCATG 
CTATTCAGTA 



TCrGGCTAGA OAOTAAOTTA 



TTGGTTAACC 
TTGCCTTGTA 
CAGACTTGAG 
GAAGCTGTTT 
GGAGCCATTT 
CTTTTAAACT 



ATGTATTAGC 
TTTACTGGTA 
TGATCTAAGG 
TGTTTTCTTC 
TCTTCTCAGC 
ATTCAGTTGC 
TTATTTCGTT 
CAAGGOGGOA 
CACTACTQAT 



TGTTCATCTG 
GGCACTGTCC 
GCCAAAGTGC 
AAGCCTGAGG 
CTCCTAGCCA 



CAATAAAGCA 
GGGAGAGAAG 
CATTCAGAAC 
ATTTCGT7AA 



CAGAGTG6AT 
AAAAGGGAAA 
CAGCAAACAC 
TGGAGATGAC 



TACTGTAGGT 
ACIGTGTGGA 
CAAATCCTTT 



ATA1 
ACCTATTAAA 
CATTAAAASA 
CTTCAAAGTT 
ATOIGAATAG 
GSTCACTGTO 



AATGCCATAT 
ATTATTATAG 
ATAATGCCAA 
AAAAAACACA 
AAGAGTGTAA 
TTAAATGAAA 



TGACCCCACC 
TTAATTAAGC 
GAAGCTGAAA 
TGAATTTCTC 
TTAAGTTGGC 
AGATTTGGCT 
OTAACTACCC 
CCTATATTAA 
CTATACCATA 



ACACCAAATA 
CATCCTGGAA 
GTGAAAAATC 
AGtTATGGTT 



1080 

1200 

1320 
1380 
1440 
1500 



1740 
1800 
1860 
1920 
1980 
2040 
2100
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 



AACATTTATA 
AGATATTIAA 
AGAATTATAT 
CTCATAAAAC 
CCAATAATGT 



TTGAATQTTC CTTAAAGGTT 
TTTtGGRAGA CTTACGATGC 
ACATAAAfiTC CTTTTAAGGA 
AGTGATCAGT TAATGCCTAA 
TATCAACTCC ATTATGTATT 
AGACTATGAG QTACCTTGCT 
TAATTTGQCr TC3kAGTTTCA 
TCTATATAGC CTTTGCTAAA 



ATGTCTTCCT 
AACATTTCTA 
ATGTATACAA 
GAAAATCTAA 
GAGTGAAAGT 
ATGTCTGCTT 
GTGTAGGAGG 
TGAATCTQTA 



3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 



11 



21 



31 



41 



51 



MSRTAYTVQA I.LU.]jQTU.P AAESKKKQSQ GAIPFPOKAQ HNDSEQTQSP QQPGSSNBaR
GQGRGTAMPG EEVLBSSQEA WTVTERKYIiK RDWCKTQPI^ QTIHBBGCMS RTIINRFCXG 
QCHSFYIPIIH IRKBE6SFQS CSFCKPKKFT TMNVTIiNCPE LQPPTKKXRV TRVKQCRCIS 
IDLD

Seq ID NO: 462 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..2733 



31 



AKJAAAGTra 
TTCCTGGGQA 
CATCTAGGCC 
GAGAAAAGAG 



AACTGCTACC 
AGCCAGAGTG 
AGGTTTACAA 
ATTGAAATTC 
ACKCAATTTC 
TCTQAACTOC 
CTGTTTCCAT 
GTCTTTGGAT 
GGAAACATCA 
CTCTCTCTGC 



GAATTATCAG 
CCTGTGAAGA 
TTCACACGGC 
TCAATTTCTG 
ATGACCTTTT 
AACtTAAAAA 
GAAATGGAA6 



CATCAAAACA 
ATATCAGCTG 
TTTTCTGAAG 
AGCAAAGGCT 
CAGCTACACC 



TAGAAGACX36 
TTGGGTCCAA 
CAGOCAAGTS 
TTGAAGAACT 
CATCCTTOGT 



TQAGAGAACA 
GAATTCATCT 
AGCATATGAA 
CATCGTTOCr 
TGAACATGTT 
CTCTTTCAGA 
GGATGATGAA 
TGAGTCCTCT 



I 

TTCTTCACCT 
AAAAAAGAAC 
CTGCTTCAGG 
CTCTTGAAGC 
ACCACAGACT 
TGGTTTCCTC 
CCAAGCTGTG 
AAGATTTGGG 
TCTGCTATAT 
AGAATTCAAG 
GGGTATGAAG 
GCCGAGAAGG 
GTGTTOGGAA 
TATACCCIGC 



41 
I 

TCACTGACXK3 
TCATTGTGAA 
TGACCTATAG 
CTCCATTATT 
GCAACAGCCT 
CCTCATGCCT 
AATGTCATCT 



CCACX3GTGGC 



ACTCCAAATA 
GTTTTGAGTC 
TTGTTGGCTC 
CTAAGACAGC 
AAGCCCAQT6 



AGATTCCAAG 180 

ATGGTCACAT 240 

GAATG6ASTC 300 

TGATCXWSkO 360 

CAACAACCTC 420 

AATTAATGAA 480 

TGCAAATGGA 540



GGTTCAGGTC 
CaGCAGTOCA 
CCTTCACAAO 
TAATOACATT 



GCAAAATCTT T 



TCATCAGGGA GACTTGTQTO 
TTGTAGGCAA TQCX»CIGAG 
TTGGGCAAAA CCCATCAACC 
AOVGTGGGGA ATCTGGCTTC
GCCAGCCATT TC3VCK3GTCTC 
ATCCTTAATT CAGCCTCAGT 
AGCTCACGGT TACTAQAGAC 
CCTCTOAATT TTTCTCX3«3AA 
CTCAAAA«30G GTTACAGCTA 
AGAGGCOSTG TGTTAATTGG 
AGCATGGCCT CGTTGACTCT 
GTCAATGGAC CTGTGATATC 
TTTTTTTCCA AGATAGAQTC 
CATTTGCAGT GGAACGATGC 
TGCCAATGTA CTCACTTGAC 
ATCTTCCCCG TTGTAAAATG 
ATTTTATGCC TGATCATCGA 
CACACACGTC GTATTTGCAT 
TTTATTGTTG GIGCCACAGT 



CAATTCAACA 
AACCAACTGG 
ATTAGAAAAC 
ATTCATTGAC 



GTCAGACCAA 



GQGGAACATT 



ATTCTGAGCA 
ATGGAGGATG 
ACAGTCTTAC 
ATCAGCACTC 
TGGAAAOGGA 
ATQTGTCCCC 
TTCCAOAGAT 



ATATTTCATC 
TCATCaGTAT 
TGCGGGAAGA 
TG6TGCCTCC 
TTOCAGTGAA 



AAACCTGAGC 
AGGCTGCCAC 
CTCCTTCT CC 



CTACCC6TTT 
CAAAACTATT 
CAGCCTCATT 
CTAGTQAATG 
ATATTGATGT 
GTGGGACTGG 
TCGAAGCAGA 
GCCCTGTCCC 



CTTCCAGA 



AAACTCAAGA 
CACCTTTTGT 
GTATCTCC3VT 



CTGCTGGCTT 
GTTGGATTTT 
ACGCAACCTA 



OTGGTGCTGC 
GATGACAAGG 
GGGCTCACCT 
GTTATTTTTG 
TTGGACAGTA 



ACGGGATCAT 
GCCTGGGTTA 
GCAATACCTA 
TCCTGGCTTT 



CCTOGTGTTC 
TGGGTGCCCT 
CAAAAGGAAA 
TGTTGTCCCT 
AAAGCTCTGG 



CTCATTATAT 
GATGTGTGTT 
GCACTGCCTA 
AGGCCGACTG 



CC31CCATCAT CXSSCGTGQGG AAOAGCCTCC 
ATAGTCGACA 
GGATTTTTTA 
AACAAGTTGT 
TCTGCCAAAC 
TTTTCTCATA 



CTTTACTCAA 
A6CTG0QACA 
A6CAAAACTC 



TQCATTCCAG 



V CTCAGTTTGT 



ATCAGATTTA 
CCATTATGCA 
CTCAAATGAA 



TCTTGATTGC 
CTGGAGTCTG 
GQATGCTCAT 
CCCAOCATTT 
CTGTCATTAC 
GGCTTAACTG 
TTGTGGCTGT 
TTGGGGAAAG 
TCATTCTGAC 
GCCAGAATCT 
TCTTATGCTT 
CTGCCTTAAG 
CCAAATTCTC 
CTGGAGATTC 



TCTGTCACTG 
AGCTGACAAT 
AAAGTATGCC 
GACAGCTCTT 
CAAAAGCCAA 
TATTCCCATC 
AACTATTATC 
AAATGCTCAG 
AGTTTTCCTA 
GGATTTCAGT 
CATCGTGACG 
CCCCTCTACA 
TGGAAGTCTC 
CCAAACCTCT 
TGATGTCTGG 
CACAGCTGCT 
GCTTGGCATC 
GATQATGGCT 
CATTGCTGTC 
GTCCAATGGA 
GAACTTCGTT 
ACTGAGTCGG 
CCCTCTGCTA 
GGCTTGGCAT 
TGGAATACTC 
TTCTTGGAAG 
AAAGCCTTTC 






TAA 



I 



MRVGVLMIiIS PnPTDGHGG FliGKKDOIKT KKELIVMKKK HIiGPVEEYQL 
LQCTCEDSYT 
RFTMDLLHSS 
SEIiLSAIERV 
OIITAKCESS 
TVGNIASWS 



EKRDLRHFLK 
NCYIflTAGAL 
IBIQIJOCAYB 



PSCECHUINI. 
RIQGFESVQV 
VFGKAQCSIDI 
FSMIV<a?ATE 
MEDVISIADN 
WKOIPVNKSQ 
LPVSKHGHAQ 
LVNETQDIVT 
WKQIKXSQTS 
LFFWMW1LQI 
DVCHLNWSHG 



r KIWGTFKINE S 



AAVSSFVQNL 
ILNSASVTNW 
LKRGYSYQIK 
VNGPVISTVI 
CQCTHtTSFS 
HTRRICMVUI 

SKPLIAFWP 
GLTWGFGIGT 



YTLPCSSGra 
SVIIRQHPST 
TVIiliREBKYA 
MCPQNTSIPI 
QNYSniEVFL 
ILMSPFVPST 



RGRVIiIGSOQ 



HLQWNDAGCK 
ILCLHEALF 
VFFTHFFYLS 
TQPSNTYKRK 
DDKATIIRV6 
LDSKLRQLLF 

imijTQfvsne 



Seq ID NO: 464 DNA sequence

Nucleic Acid Accession #: AB035089.1

Coding sequence: 9845..10219



IFPWKWITY 
P I VGATVDTT 
V6FCLGYQCP 
WIiLVLTKLW 
VXFAUMAFQ 
NFLONKGHYA 



1140 
1200 
1260 
1320 



1680 
1740 
1800 
1860 

1980 



2220 
2280 
2340 
2400 
2460 
2520
2580
2640 
2700 



IIiSNISSLSL 360 

ISTLVPPTAIi 420 

FQRSLPBTII 480 

QPHCVFWDFS 540 

VGLGISIGSIi 600 

VMPSGVCTAA 660

LIISVITIAV 720 



aFFII>CFGIIi 840 



1 11 
I I 
GGGCATGCRO CCATCGGGGA 
CAGTTCTAGT AAAAGGGAGA 
CCAAGAGGAA TTAGGGAGAG 
TTGGTTTGAA AGCATACAGT 



21 
I 

AAATCCATAG 
ACATCAATAT 
AGTTATAAGA 
AAATATQATG 
ATAATATCAT 
TTTTCAAACC 
TCTTGTTCTQ 
TAC3U3TAATT 
TTCCCTATAO 



TGCAGATAAA GC3kAQGAGGA 
TTAGCAATAG 
GGGACAGGGT 
GCAGTGTTGG 
CTCCAACTGT 



GGA TTCCT GQ 
CAAGTGTTCA 
TCAGGATATG 



CAAGCTTTCT 



CXaVTAGATTO 
CCTCCATTCC 
TGGTACCCX3A 
CTGGATGCAG 
TTTCCTCTTC 
CTATTCCCCQ 
ATATATATTG 
GTGAGAAATT 



GATATTTCCT 
AGCCAATGAA 
TATGCAAAAA 
TCCAGTCTCA 
ATTTAGQAAC 
CTAAATTTAA 
AGGGCAAACC 



AATCTAGGAO 
GAAA6T6TAA 
GTTGGTGTAT 
CTTCTTGGAA 



GATCAGCAAG 
TCTGTCECTG 
TTTCTCTGTG 
TTACTGGAQT 
AGGTTGTTCA 
GACATATATT 
ATTGCCACAA 
ACAATGCTGA 
ATATTTCTTA 
TTTCCCATTG 
GTTTATGAAA 
TTTCTGAGTT 
TATGTCCTTT 



GTCCCCTOTA 
CAGGATQAGC 
GTAAATCCAT 
ACTCAGCTGA 



ACCCOG8TTT G 



ACTTCyUVTCA 
ATACAGGAOA 
TCCAGATTGG 



CCTACTCCAA 
GAAGACCATT ATTCATTTTT 
TTCTATCTTT 
GGGAACTTAT 
CCTAW3AAGA 
TUUICACAGCT 
TOTTGTAGAT 



r ACCTCTTAAA 
^ GCATGTCTTG 
C TCCTTTCTCC 
r AAATCTCCCT 



TCrtGOAQAAA TCM3AACTCT ATTCAOQGTC GGTTGGAATQ 



ATGTAT ACAT 
CTOTTATTTG 
GCACATAATA 
GAAAACTTTA 
ACC34ATCTAT 
AGGATTTGTT 
TATCAAGAGA 
CTCraTGGCA 
CTAGCCTGTC 
ATTTCTCATT 
AGTTAQTACC 
CCATAGTCTG 
GTTATCCTGT 
GACATTAGAT 
TCCATTTTTG 
GGAATTCTTT 
TCCATCAAOQ 
CTCATTGAGA 
GGGGTTGAGG 
CATCCAGCCC 
TGACTTTGTG 
ATACCAAGTG 
CACACTTGTG 



AGAAGAAGGA 
AAAAAGAAGG 
TAGATTTGGT 
CAGAGTAGGA 
ACXTACATAT 
CATGAAAACC 
ATCTATATCT 
ATGCTTGAAA 
AGAAACAATA 
CAGCCTTCAT 
AAAGGCATTA 
TTTAATTTCT 
CATAAGTTGG 
ATATATGACA 
TATCACATGC 
TGAACTCATC 
TTTCCTTAAQ 
AAATTCTCTT 

TCCTTTTCTT 
TCTTATAGCG 
ATCTCAGATA 
CCCCATTAGT 
GACTCAAAAC 
AAACAGGCAG 
CTACTTTCAG 
ATGTGCTGAG 
TTGGGGAGAA 



1020 
1080
1140 
1200 
1260 
1320 

1440 

1560 



CAGAATTCTA 1740 
TGGAGAAGAG TCTGGCATTT CCTCAAMVTG TTAACCTOOA TTTACCATAT OACCCAGCGA 1800
TTTCATTCaT MGTTTATAC TCAAAAGAAA TGAAGAAATA TGCCATGCAA AAAAATQTAC 1860 
ATGAAAGGTC AOUW3VTCAT TATTCATAAT AGTAAAAGGA TGGAAACAAC ACAAATQTCC 1920 
ATCAACTTAT GATTAAAGAA AATCTGGTCT ATTCATAQAA TGGAATATTA TTCCaCCaCA 1980 
AAAAOSAATG ATGTACTGAT CXXTGCAATG ATGTGGACAA ACTATGAAAA TAACACTAGA 2040 
TTAAAGAAGC CAGTCACAAA AGGACTTACT GTATGATTCX: ATTTACCTGA AATGTTTGGA 2100 
ATAGGCAAAT CCATAGAAAC AGGAGGTAGA TTCCTGGTTT CCAGGGTCTC CAGQAAGGGA 2160 
AGAATOAAQT ACAAGATTTC TTrTGGAGGT AGTGAAATTG TTGTGGAATG AGATCATGAT 2220 
GATGATAGCA CAACTTTGTG AATATAATAA AATCATTOAA TTGTACAQTT GAATTTATGG 2280 
TATATAAATT ATATCTTAAT AAAAAGGGGG TCCACAAAAC AAACAGCCCC CCACTCTGGT 2340 
TSTCAaaOAa ATATTGGATT AAATGGCCTT QGACAACAAC CCCTCTCCCT GOCCACAOAC 2400 
ATTCTTCAQA TTACAAGATA TrCCAlSGGQA AACACTOGAA TSAGTCTGAA G CaUSgTGC T 2460 
AAACAGAAGG ACCATTCSAGA AATGTTGTGA TCCTQACSkGG TCAAGCAATT TATrrTTOGG 2520
CTTCATTTTT AAATGTAAAA TTAGAAAGCT GCCATTTAAA ATGGCCCGTC TGTTTCAATT 2580
GCrCTTCTCA GTGTCAGCCT GTTAACTCAA TGTGTTAGTC TGTTTTCATG CTGCTGATAA 2640
AAACATACCT GAGACTGGCA AGAAAAAGAG GTTTAATTGG GCTTAGAGTT CCACMTGATT 2700 
GGGGAGQCCT CAGAATCACA GTAGGAGGCA AAAGTTATTC TTACATGGTG GCTGCAAGAG 2760 
AAGATGAGGA AGAAGCAAAA QAAGAAACCC CTGATAAACC CATOGGATCT CCTGAGGCTT 2820 
ATTAACTATC ATGAGAATAO CACAAGAAAG ACCGGCCCCC ATGATTCAAT TACCTCTACC 2880 
TGtSQTCXXrrC CAATAACATO TQQAAATTCT GQTAGATACA ATTCAAGTta ASATTTGGGT 2940 
GGGAACACAG CCAAACCATA TCACTCASCA AGGCAOATAA CTTTCTCACT OAGCCTATGC 3000 
AACAQAAAAC CATCTGQGAT GGTTGTAAGG GGCACAGGAA GTGACTGGTA GGATCACTGC 3060 
CAAAGCTGAG CACTCAGGAG AAGGCSVATAG AATCCTATTC TCCATAGTAT GCTATAAGAT 3120 
ACTGAAGTAC ACTTCTTCAC TATCTCTTTG GACTTAGAAT TAGCACTACA TTCCTTGTTA 3180 
TACAQRAAAA TTACTAAGGA AATTCATAGG ATGACAAAAA CTTTCAGAAC TGAAAAACAG 3240 
GAAATGTAAG CTTTTTAGTT CTTTGGTATT CGAAGTATGC CTAAAAGACA ATGCAAAATC 3300 
CAAGAAAAGA ATGOTGGGGT TTTTGTTTGT TTGGTTTTGT TTTTGTTrTA CAGCTGGAGT 3360 
AOAATACAAA GaaATGOAGT TQAAACAAAT GAGAGGAAAT TGGAATTCtA AACSTATTCT 3420 
CATTGGCATT AOAAAOaCSVC CTACATGTAT TTCACATQAO CCGGT6ACTO CTGACTTGCA 3480 
TTCTTATTTT TTOCCTATAG ATTAAAAAGG AGOTACAATG BTAGAACTOT AATCCTQTCC 3540
TTTGTCATAA ATTTTCATAT TCATAAAGGT GAGTGTTAGC CCGCTTGTGA AATCTGAAGT 3600 
TGAGTAACTT CAAATACTAA CCACAGAGGG AAAGGCAGCA AGAGGAGAGG CATAAATTTA 3660 
GGATCTCACC CTTCATTCX» CAGACACACA CAGCCTCTCT GCCCACCTCT GCTTCCTCTA 3720 
GGAACACAOQ TAAGAGCTTC AAGCCTCTCC AGCTTAATAA CATGAATTAT TTTTGAGAAT 3780 
AATAATGATA CTGTGTTCTA TATCATGCAT CTCCTGCATT CTGTCT GATT ATATTTTACT 3840 
TATTCTCCCA GAGCAAAATT AAAATACCTA TTTCATCTGA TTTGTCCTTT ATCTAAATTG 3900 
CTTAQTTCCA AfiTAAACCAA GGCACTTTTA GGAACACAGA GGGAGAGTGC CTTGCAGCCA 3960
QMOAGTCTTG AA6GAGATCT CAG6QA06CA TCTTAACAGC TGGTTGGATG TGATCaVC3«3 4020 
AG6TCICCTG TTABCATTCA TTGTAAAGCC ATCCTACCTA GCTCTAGTGT AACCAGCAAT 4080 
GAAAGAAAGA TAAAGAGGQT OQATTACTTA TTTACAATAG TCTTTAAAAA CGTAQTTTTG 4140 
TAAGCCTTCT AATTAGGACaV TTAATATATT TAATATATGC ACATTGTAGA AAGATTGAAG 4200 
CGTTAAAAAT AAGAGAAAAA CTTTAAATGT CAAAATCTCA CAACCCAGAT ATATCATTTC 4260 
TTTAAGAAAA TTGTACTACA AAATACCATT CCATTTATTA AAGTCATTCT GACaUMAATC 4320 
TGATGCTTTT CCAGOAGTTC CAGATCACAT CGAGTTCACC ATGAATTCAC TCAGTQAAOC 4380 
CAACACX»AO TTCATGTTCG ATCTGTTCCA ACAGTTCAGA AAATCAAAAG AGAACAACAT 4440 
CTTCTATTCC CCTATCAGCA TCACATCAGC ATTAGGGATG GTCCTCTTAG GAGCCAAAGA 4500 
CAACACTQCA CAACAAATTA GCAAGGTAGC TATCAGCATC ATTAOGTTGT CCraTTQCSia 4560 
TTTTTCTCTQ STTCCXSTCGG CTAGCACGCA GATGGTAATA GATGTGGTGG TCTGATGGGT 4620 
ASCACAGGGG GCTGTGCAGG AATTCCCATA ACTGTOAGAC CACTGACTTA AACAGATCTT 4680 
TTGAGTAAAG TTTTCTTOTC CCGCTTCATG TCTCTTCCaa GTTCrTCACT rrGATCAAGT 4740 
CACAGASAAC ACCACAGAAA AAGCTGCAAC ATATCSVTGia AGTCACAGAG CACTCTGATT 4800 
CAGCTTTAGA TCCCTGAACA GtSTCATAGTT TAAACCTGOA ACTTCACKAA AACTAAOAAA 4860 
AGGCCAGTTT TAGGGAAAAT CTTGGACACA AAGATTGAGA CATACAOAGT GGSTTGGCAT 4920 
TTCATaaCAC ATAATTATTA TTCCTCATTT CTGCGTTACT AAAAGACSUST CAGCACTGTA 4980 
OCTCAGAQCa TAGGTCTGGA TCAGGATAGG CTGGQTTCAQ ACTCCAGCTT TGCTCTTCAC 5040 
AAAIOATGAA TAAGAGCAGG ACACAACTGC TCXXSRGTCCC AGTGACCTCA TCCCSGAAAA 5100 
CTAAQGGTAA GAAAAAATCT GACTCAATAC ATGCAAATAC ATGCAAATGT TTACAACAGT 5160
QCCTTGCECA TAAAAGTCAT AATAAATGTT ATTATTATTA TAAAGTAGCT ATAATTATAC 5220
TAATCATAAT AATGTGAAAA TAATTTAATT TTCATTGAGT CATTAATGJM3 ATTCAGAGGA 5280
ATAAGCACAA OTCCSkAGTAT ATTTTQGAAA ATGAZTGCTA TGGAATATAT TGGTT TAG AG 5340 
CCTTAATAGT GCAAAATGCT TTGCTGGAAQ GTAGAAAGTT CT AGATTTAA ACAGCCSTAG 5400 
GTTCAAAACT TGGOVCTTCT AATTTAT3TC TCTATAAACA GGQTTTTTTT CCCCATTCTC 5460 
TGAGCTTTCT TGTGTTCATC TGAATTGAAC TJUkAQACTTA GAGTTACCCA TGTAAAGTCC 5520 
TTAGCCATGO ACXTGGCATA CACTCTTCTT ACGTGCAGAG AATGACCATC ATGAGGAAAG 5580
AGCCACAGAT CAGTCSJ^TGT GTCCTACAAG ATAATAGCAC CAACAGGTAT AACAGGGCTT 5640 
CCTQGCATAA TCTATTTAAA ATATCCAACC TTCAACATAC TCXTTATCCTT GATGACTGTT 5700 
AGAAGTGAAA TATGGTCCTT GCCCATAAGG AGCTOAGAGT TTAACTGGGA AGCTAAACXTT 5760 
AACCCTTTAA ACCAACAAGG AGAAAATCXA CTGGTAGACA GCX5CTGCATC TTTAOTTCAa 5820 
AAGAGAAAAG ATTGCAOTAC GTTA(»GCAA GAAGAATTTT CTCGAACAAG TCAAATATAA 5880 
GGTGGATTTT GAAGGQTATT TGAGStGAAA TACAOCAATT ATCAQQQAAT AACMCRRAQ 5940 
GTCCTCAATQ AOACTACCAQ CATTTAGGGA C IGATCTA AC AGACTTA BCA TGGeTTTAGT 6000 
ATTTACATTG ATACAGCAAT TGAATGATCT CCTTTTTTGA TGTTTGAAGG TTQATAGGTC 6060 
AGGAAATQTT CATCACCAGT TTCAAAAGCT TCTGACTGAA TTCAACAAAT CCACTGATQC 6120 
ATATOAGCTG AAGATCGCCA ACAW3CTCTT CGGAGAAAAG AOSTATCAAT TTTTACAGGT 6180 
AATTTCACCT GGCCTACCCA CATTTCATTT GCATCCTGAT GTCTGTGTCT CTGAGTGGCC 6240 
AAATGGAAGA AAGCS^GGCA GATGAGCCTG GCCXSACCCa^ GTCGAGAGCA TTTACTCAQA 6300 
OIGCATTAaC TCCATTTCCA CAACTCTCCC CCACTGGAGT GTCCCAGACX: CCAAOGATAC 6360 
ATCACTGAAQ TGfTGGATTTA GGGATAATCT TOTOAjrAAAA QAGGAGGTTG TGTAATAGAG 6420 
TQAGTAAGAG TAAXAAGTAA TAAGATACCA TCXSATAAACT QOCACTaACT CAGTCACMTA 6480 
OSATACATCT raOTOGGAAA TQTATGACTA ATGGOVTATT ATTGGAATG6 GGAGGCTTGG 6540 
GTGAGTTCCT GAOAATAOTT GAGGAAGTAC CAG6AAATAT TGAATGCACA GGATGAAAGA 6600 
CAAAAACAAA GATCAGAAAC ATCATGGTTA AAATTACTGG AGAGAA6TCT GAQAAQCAAT 6660 
GAATCTCCTT CAGGGAAGCC TGCTCTGCAG TTTGCAAACC ACAGCCTCTT CTGCTTCTGC 6720 
CTTXTGCXaiA GATGATAXTG ACCTTCAGTG ACCTCTTTCT TGTGCCAGCC CACATTCCCC 6780 
TTTTGCATTO CCTACATGAC ACCTGTATAA AAATATCCAT GGACAGGAGA TACTGCATCT 6840 
ATTCavQGGTC TGGATTCAOC TrACTGTTQT TACAAATAAG TAAGTTTGGT AATATATAGT 6900
TACATAAATT ACTCCTAATT CCTACTTCTT CCTTCATATC TCAAAGGAAT ATTTAGATGC 6960 
CATCAAGAAA TTTTACCS«3A CCAGTGTGGA ATCTACTGAT TTTGCAAATG CTCCAGAAOA 7020

AAGTCQAAAG AAGATTAACT CCTGGGTGGA AAGTCAAAOS AATGGTAGGA GAGCCACCCA 7080 

TTATAGAAAC ACCTTTGAGA AACCTATGCC AGTGAGCCTT GTGCTTGACA CTGCATGGGG 7140 

GAACAGGTGT GGGGATTGAG ATGGGTTTGC AGGGAGGGCT GAAGAGGGCA CTCCAGATGA 7200 

AGGATTTGTC CAAATGAATA TGAAGAGAGC CTAGGGGAGC CAAGGAQGAA ATCACAGGAA 7260 

GCCAATTAGA TGGJUUVCACA TCTGGAGAAT TATTTGCTTA TGGCCCTGCA TGACAATAGC 7320 

TTTGTGGATC CCCTGTCTCC GCTCAflAOCT ATTTTGAGAT CATATCCTTT ACTTTAAATC 7380 

AGACTCAAAT TTTTATGATG AATATTTAAT AGAAAACATT AGAAAflCGTC TCTCGTCTCC 7440 

TTTACTAATT GGGAAACAAG CAGCTCTCTQ GTAAATCACC CTTTTGTCTC TGAGCTGGAG 7500 

CTGCCTCGAT CACATCTGTA GCCAATGTGT TCTGCAGGGA TTATCACAGC TCTCTTCCCC 7560 

ATCAAGGGCA AAGAGCTTGA CAAAGTCTCC ATTCTACAGA CATCTTTCTT ACCTCCCACC 7620 

TCTCATTACA GGCCAAACTT ACAGCAACTC AACATGAGAG TGAATAGGAA GATACCCCCG 7680 

GAAGTAGTGT CTGACAGCAC AGGACATGCG TTTCATATTA CAGAGCTCAA GTCACTCATC 7740 

CTAAAAXGCA ATCAGGGCCT CCTTCCTCTG AATGGGGACC CCX3TAGTTAA AAAAAAATAA 7800 

AAGXAQGAAS AGGAGGGAfiG GAGAAAGGAA AGACACATGT TGGAAOAGTA GACAAAATCa 7860 

GTTTATCAGT ATTCCAAATC AGATGATTGO AGACaTTOVT ACACAGAGAA CGTGAACTCC 7920 

TTCTCTATCA CAAGAAQTQA TGTCTCCATC AAGOOTAACT rTATAGGACT QGAGCCTTGA 7980 

AGAAAGCTGC ATCTGGK5AA CCACTGGTCA GTGAGTCTAA CAATTCAAAO ATCaAAGTCA 8040 

GTGAGTCTCA AGCAGGGATT TGGGTCAATA ATTAACGATC AGTCACX»AC ATTTGCAAAG 8100 

CATCTTCCAS ACAAGCCATT TGTAGCTTGT GTAAAAGACT CTTTTATTCT TTCCCTTGCA 8160 

GAAAAAATTA AAAACCTATT TCCTGATGGG ACTATTGGCA ATGATACfiAC ACTOGTTCTT 8220 

GTGAACGCAA TCTATTTCAA AGGGCAGTGG GAGAATAAAT TTAAAAAAGA AAACACTAAA 8280 

GAGGAAAAAT TTTGGCCAAA CAAGGTATTG TCTATATTTT ATTTATATAG TGTAATATGT 8340 

TAATACATGQ AATGTTAAAC ATTTCTOATO GAATGTAACA TGATAAGTAA AAAATAAAAA 8400 

TTOTTCRTGT CTGTTATTTT GTTGTrrTAC TCTTATAACT TTATTTAGTT AGGAATACCT 8460 

GAAAAACTAT TGTTTCTAAC TCRTGGAATT CCTGGQTTAT TTCTT AGAA G AAGAAGGATG 8520 

TGTTGCTATC TCAATAATAT TATCTTTTTT GTCTTGTGTT TCACGTGTTA TTTGTTGGAC 8580 

ACATrOATTT ATTGCRGAAT ACATACAAAT CTGTACAGAT GATGAGGCAA TACAATTCCT 8640 

TTAATTTTCC CTTGCTGGAG GATGTACAGQ CCAAGGTCCT GOAAATACCA TACAAAGGC3V 8700 

AAGATCTAAG CATGATTGTG CTGCTGCCAA ATGAAATOSA TGGTCTGCAG AAGGTAAGAA 8760 

CTTGCATCTA CAACTCTTCC TTCTACTGCC GGACATTTTT CCAAAGATAC CAAGTTTAAA 8820 

CAAGGTAAAA GCTTATGACC GAKSTTGCXTTC AAAATGATGA AAAATTCTAA ATQAGGAATG 8880 

ATGACTCACC TTCATATTAC AAATATTTGA GCATAGGGCC TGACACAAAC TGAAAGCTTA 8940 

fa - rri - rrGri.T gttiqtttgt ttttattatt attattataa tactttaagc tttagggtac 9000 

ATGTGCAC3UV TGTGCAGGTT AGTTAt3VTAT GTATACATGT GCCATGCTGG TGTGCTGCAC 9060 
CCATTAACTC ATCATTTAGC GTTAGGTATA TCTCCTAATG CTATCCCTCC CCCCTCCCCC 9120 
CACCCCACAA CAGTCCTCAG AGTGTGATGT TACCTTCCTG TGTCCAAGTa TTCTCATTGT 9180 
TCAATTCCCA TCTATGATTT AATTCCATCT ATGGCTrAGT TAATGATTAA TTTATTAGAG 9240 
TTACATGCAT TGGATATCAA TTTGATGATA TTATTATGCA GCAATITAAA CTTOACIGGO 9300 
AGAAATATAT ACCAATGTGA GGAAAOTTTA CAAATAGGCC 6AGTAGAAAA GOGAATACAA 9360 
ATTTAGOAAT TTAGGGAATT ACAATTTAAT AATTGCAATG TGTACTAAAT AATOTATACSV 9420 
GRAAAATATG ATCAGCCTAT TAAAAATTGA CACATGTAGT AGGCTGTTGG CACAAGAAAT 9480 
AQTGATACaT AC3«3TTCATT GTGTACAAAA TAATGTAATC ATATTTTACA TGTGTATCAT 9540 
ACAGTTGTAT ACATACATAT GTACACATAT ACATATACGT AAAAACATGA TTCTGTTTTT 9600 
ACATACATQT ATATACATAT ACACATATAA CCCAATGTAT TTATATATTC AGGACTCATA 9660 
TTTTACCTAT TAGAATAATA ATGTCTATTA AAGTGAAOCT TCTOTATTTC ACATTTATTG 9720 
CCAAAATAAC GAATCTCCAC ATAGTCAATT CATTGTXAAO 6IQTATTA6A GATOGACAGT 9780 
TAGTCATATC AGTrTCTTTT TTCX3VTTTGT ATAGCTIGAA QAQAAACTCA CIGCTGAGAA 9840 
ATTGATGGAA TGGACAAGTT TGCAGAATAT GAGABAQACR TGTGTCGATT TACACTTACC 9900 
TCGGTTCAAA ATGGAAGAGA GCTATGACCT CAAGGACACG TTGRGAACCA TGGGAATGST 9960 
GAATATCTTC AATGGGGATG CAGACCTCTC AGGCATGACC TGGAGCCACG GTCTCTCAGT 10020 
ATCTAAAGTC CTACACRAGG CCTTTGTGGA GGTCACTGAG GAGGGAQTGG AAGCTGCAGC 10080 
TGCCACXGCT GTAGTAGTAG TCGAATTATC ATCTCCTTCA ACTAATGAAG AGTTCTGTTG 10140 
TAATCACCCT TTCCTATTCT TCATAAGGCA AAATAAGACC AACAGCATCC TCTTCTATGG 10200 
CaCATTCTCA TCCCCATAGA TGCAATTAGT CTGTCACTCC ATTTAQAAAA TGTTC ACCTA 10260 
GAGGTGTTCT GGTAAACTGA TTGCTGGC3UI CAACMGATTC TCTTGGCTCR TATTTCTTTT 10320 
CTATCTCATC TTQAIXIATGA TAGTCATOVT CAAGAATTTA ATGATTAAAA TAOCATGCCT 10380 
TTCTCTCTTT CTCITARTAA OCCCACATAT AAATGTACTT TTCXmCCAO AAARATTTCC 10440 
CTTQAGGAAA AATGTCCAAG ATAAGATGAA TCATTTAATA CCGTGTCTTC TAAATTTGAA 10500 
ATATAATTCT GTTTCTGACC TGTmfAAAT GAACCAAACC AAATCATACT TTCTCTTCAA 10560 
ATTTAGCAAC CTAGAAACAC ACATTTCTTT GAATTTAGGT GATACXTTAAA TCXTTTCTTAT 10620 
GTTTCTAAAT TTTGTGATTC TATAAAACAC ATCAXCAATA AAATAATGAC ATAAAATCAT 10680 
TTTTGCTTTA CCTGTTTTCT CTCTGQAAAG GGCAAGTGTC CAGTTACACA TAGQAAAGAT 10740 
AATTTAOAGA XATATTAATC ATATATAAAG GAAAATTAAA AAC»GAGTAG TTCATGATGA 10800 
GCCTOQAOTA GAASGCATAT CCCAGAACAG QAGGAGCCTT GTAAAOCACA TAGGAACTTC 10860 
CTATTTTATO CTAAAOGOAT AAGAAACICA TTACSWOacrr TGATGGTTGT TTCTCAAAGA 10920 
GGGGCATAAA ATTATCATAT CCACKTCTAG AAAATACATC TCTGGCTAOB CTGATATCAA 10980 
TGQATQCXSAa GAAAGAACAG TGTGGTTACC ATATATAAAT TAGGAAATCaV TTAGAGTATT 11040 
GGGAGTGGAA ATGGAGAGAA AGAAAGAGCC TGGGGGAATT ATTTAGGAAA TAATAGTTAC 11100 
AGAAAGACAT CTAAGTTGCT GACCTATCTG ACTGGATCGA TGGAAGAATA TCTTGrTTCT 11160
GAGAGAAAAA AAGACTTTGG GTTTAAATTT GTACTTGATG AATTAAGGTA CTTTTAATAT 11220 
TCAAATGGAT TTGCCTGGCA GGCACTTGAA GATATTAGTC TAAATCTCAG AAACAGAATA 11280 
TGATCTGAAG CTCTAAATTT GTQATATTC» ATATAAATAC TTTAGAGTCA TTGGGATAAA 11340
TATGQTaGTT OTAGCTAAAA GCAAAAATAA QATACTAGGO AGAAAGQATA AAGTTAGAAG 11400 
AAAGAAGAAT CTAGAATTQA CCTTGAAQTA TATCAGCATG TCTAAAGATC AGGAATTGAT 11460 
CATTTTTATT TTCCAGAAAG TAGCTTTTCT TRGQGTrCCA TATTtACTCC CATAGATTCT 11520 



I I I i I > 

MNSXiSEAHTK FMPOLFQQFR KSKENNIFYS PISITSALGM VLLGAKDNTA QQISKVLHFD 
QVTENTTEKA ATYHVDRSGN VHHQFQKUiT EFNKSTDAYE LKIANiCLPGE KTYQPIflEYL 
DAIKKFYQTS VBSTDFAMAP EESRKKINSW VESQTNEKIK NLFPDGTIGN DTTLVLVNAI 
YFKGQHENKP KKEHTKEBKF WPNKNTYKSV QMMRQVNSFN FALLEDVQAK VLEIPYKGKD 




LSMIVLLPME IDGLQKLEEK LTAEKLMEWT SIiQHURETCV DUII.FRFKMB BSYDLKOTLR 
TMGMVHIFNG DADLSGMTWS HGLSVSKVLH KRFVEVTEEG VEAAAATAW WELSSFSTN 
BEFCOfflPFL FFIRQNKTNS ILFYGRPSSP 

Seq ID NO: 466 DNA sequence

Nucleic Acid Accession #: NM_001910.1

Coding sequence: 50..1240



21



41 



CCTTCTTTTG CTGCTGQTGC 
GCCCCTCAGQ AGGCATCXMT 
GTTCTGGAAA TCCCATAATT 
QAGTGCX»Aa 6AACCCCTCA 
TOQCTCCCCA C 
COCCTCI6TO 1 



I I I I 

CAAGGGAGAA GCTGCTGGTC GGACTCACAA TGAAAACGCT 
TCCTGGAGCT GGGAGAGGCC CAAGGATCCC TTCACAGGGT 
CCCTCAAGAA GAAGCTGCGG GCACGGAGCC AGCTCTCTGA 
TGGACATGAT CCAGTTCACC GAGTCCTGCT CAATGGACCA 
TCAACTACTT GGATATGGAA TACTTOGGCA CTATCTCCAT 
TCACTOTCAT CTTOGACACT GGCTCCTCCA ACCTCTGQGT 



GATGGCrCAG AACCTGGTGG 
ACGTGGTQCG 
CCTGAATTGG 

GGTGGGAGGC ACTGTTATGT 



CAGOTCAATC 
ACCAAGTCTC 
AGCCAGGCCA 
CCTTGQCTGT 
ACTTGCCGAT 
TGATTTTTGG 
CCAAQCAAGC 
TCTGCTCOGA 



TTTCrCCATT CAGTATGGAA 
TGTGGAAGGA CTAACCGTGG 
GACCTTTGTG GATGCAGAGT 
GGGAGGAGTG ACTCCAGTAT 
GTTTTCTGTC TACATGAGCA 



CCGGGAGCTT 
TTGGCCAGCA 
TTGATGGAAT 
TTGACAACAT 
GTAACCCAGA 
TCTCTGGGAG 
ATAACATCCA 



TTCCCrCATC RCTGGCCCTT CCGACAAGAT TAAGCA6CTG CAAAAGGCCA TTGGGGCAOC 



CTTCACCATT 



TGGGCCCCTC 



CCTGCCTSTC 



GGAGAATATG 
AACGGAGTCC 
ATQCAGTTCT 
TOaATCCTGQ 
OGTGTGGGAC 
TGACAGACCT 



CTGTGGAGTG 
CCTATACCCr 
GCAGCAGTGG 
GGGATGTCTT 
TGGCCCCAGC 
TGAATATGTT 



TGCCAACCTT 



ACATQAOAAT 
ACKCCRCCA 
TGATTATGAA 
GCAGAGATCA 
CACACGGCCA 
GTACCTGGAT 

TAACATCxrrr 



CX33TCATGAT 
AATCAAAAAT 
TGGTATAATA 
OGCCTGTTTA 
CATTCTGAAG 
AAATATACAA 
AGTTGGATTG 
CTGTAAGTCT 



CTTTCAAGGA 
CATTCGACAG 
AGTCCCCTAA 
AGGCTGGGGC 
GGOTTQCAAC 
ACACACACAC 



AACXSTCATGC 
GCCTAOfOX: 
CTTOACaVTCC 
TTTTACTCAG 
GGAGGGGCCT 
ATTCTTTACA 
TTGAATTAAG 
tCACTTCACA 



OGGATGTCAC 
TACTGGACTT 
ACCCrCCAGC 
TCTTTGACCG 
TQTGTCTQTa 
CCTACAAAAA 



TTTCACATTT 
AATCCCTTTG 
TCTACACTGC 
CAAATTCCGA 
TCGGAATTCA 
TTTOTATTAQ 
CTTTCCATCr 



TTGAAATCCC OAGOTGTCAT TTGACATGGT 



CTTOTTQCAT CCT6TCAGGA G 



OGTTATAC»T 
GATTATQAAA 
CWACTCCaCT 
TGCCXavCTCC 
GC3VTTACATC 
AGCATCTCCC 
GATTCAAGCA 
ACAGAGTTTA 
TGTCT6AACT 
AAAATACTTC 
GCTGGIQCCT 



TCATATTTT6 
ATCTCCAAAC 
CAGCCCTGAC 
TCTCTCCAGC 
ATTTTGTCCA 
ATTGTCCCAC 
AGGCCCA7AT 
OCACATTTGA 



CATACACACC 



TATTGATTTT 
ATATGCACAA 
AACCCATCCa 
TCCACATGCT 



CTCTATTGGT 



AAATGTTTGG 
ATTGCATTTA 
ACQTTGCTGG 
ATAAAATGST 
CTGGGTACTT 
AATOTTAAOA 
AGCATTTT 



1380 
1440 
1500 
1560 
1620 
1680 
1740 

1860 
1920 
1980 
2040 
2100 



21 



31 



51



I I I I I i 

MKTLLLIiMiV LIiELGEAQOS IiHHVPLRRHP SLKKKLRARS QLSEFWKSHN LDM IQFT ESC 
SMDQSAKEPl. IMYLDMEYFG TISIGSPPQN FTVIFDTGSS NIMVPSVYCT SPACKTHSRP 
QPSQSSTYSQ PGQSFSIQYQ TaSLSGIIGA DQVSVEGLTV VGQQFGESVT EPGQTFVDAE 
FDGILGLQYP SLAVGGVTPV FDNMMAQNIiV DLPMFSVYMS SNPEGOAGSB lalPGGYDHSK 
PSGSLNWVPV TKQAYWQIAL I»I1QVGGTVM FCSEGOQAIV DT6TSUTBP SDKIKQLQHA 
IGAAPVDGBY AVBCAHIiNVM PDVTFTiaJGV PYTLSPTAyr LIiDFVIXaiQP CSSGFQGLDI 
HPPAGPLHIL GDVFIRQFYS VFOKOiNKVa LAPAVP 

Seq ID NO: 468 DNA sequence
Nucleic Acid Accession #: NM_018058.1
Coding sequence: 319..1575



TACGCGCTGC C 



TACACCQACA 
GTCAACGTGG 
AGAAAGGGCT 
CCTGATGCCC 
CTCAGAGATS 



CTGGACGCTA 



21 
I 

V GGGGAACGCC 

V CTTCXnCAAC 
GTTCCGCAAT 
GGCCAGCCTC 
CTCTATCTAC 



AACTTCCTTT 
GTGGACGACC 
AAAGTGGACA 
ACCCATGGOA 



AACATTGCCT 



ACAGGGGGTG 
GGAGAGTCCA 
TGGCTGCGAG 
CTCTACACCA 
TGTGAGATGQ 



TGGCTGCTGA 
TCCTCAGCAG 
TCCACAACOQ 
CX:CACCAGCA 
TCGTCTATGG 
AGGTCCGCTT 
TCATCACC3GC 
ACOGCAGCTC 
TCATCGAGGA 
TGGTQACC3GA 
TGGCTCAGCC 
TGGTGCCACG 
AGAAGAGTGG 
AGCCCXSTGGC 
CAQATQGCAA 



CAACIGGAAT 
CCS GGACA TC 
CGACTTTGAC 
CTCAGCCAAC 
GCTCAATCCC 
CTTOGACGQA 
GCT6TC00TC 
CACCOGGGTT 
GGC XXACC TG 
ACACTTTGGC 



31 
I 

ATCGGGGTCA 
ACCAATAATG 
AACCGGTGGG 
TTTGCOGGAC 
ATTGCCAATT 
GCCAGTGACC 
AOCAAATATA 
GATATCTTCr 
AOCTTTGTGQ 
GTCGCOCTGG 



I 



CCTTCTCGGG GGTGGGCACG 
AAGACATCCT 6AG0GATGAG 
GCTCTGTGGC CTGTGTGGAC 
ACGCCTACGG TAATGTGGGC 
TCTCCCGGGG CATTCTGGCG 
CAGGGGGCCG AGGCGTCAGC 



AOOCTGCGGC 



AATGACCAGG 
OGCCTCTTCC 
GGCGACGCCT 



GCCTCTATCT 
AGTTCTCCAT 
AGCTGGAOAT 
GCGTCATCOQ 
TGGAGCCTGA 
TGOACCTCAT 
ATCAQGGCTT 



CASIGCTGGT 
CCQTGATGGC 
GCAAATQAOC 
GCCCTCOCCT 
CTTCTTCAAC 
TAGAQAGCAC 
GGGCCGOGGC 



AGOATCATCQ AOGQGGQCTC 



TAAGQTCGIG 
AGGCTACCTG 
CAGTOTGGAG 



1020 
1080 
1140 
1200 
1260 




GTOCTGGAGA TCCTCTACCC 
ACACCAATGA ATGCATCCAG 
ACACCTATGG AAGCTACAGG 






CCCCCACCGC TGCTGCTGCC 
CACCGGTCCT CGTAGATGGA 
CCAGCTGCTG AGCAGGGGTG 
AAGTGGGCTT GTGCTGCTGC 
CCCAAGCCCA TCCATGCACA 
CTGIGCTGGG CACATAGCTG 
ATTCCAGTGG GTCTAATGAC 



AAAGCTATCT GACCTTACAC 
AAATGGGOAT TAAQAATAOA 
GACACrTGGC ACAAAACCTG 
GGGCTTTGTC AACACGTG 



TTCCCATTCG 
TGCCX3GACCA 
GTGGGGACTC 
ACTGCCX3CTG 
GATCTCAATC 
GGACATGAAC 
CTAGACAGTA 
TTACTTAGCT 
TOATCACAGC 
CATATCTTAG 
CTTTAGTGTC 
CAQTCACTTA 
ATCTTGGGGT 
GCACATAGTA 



GACACACTTC 
TGTGCCCTCG 
ACAAGAAGTG 
TCGGCCAGTC 
CTGCTGCCGC 



CAGCGGATGG 
GGGATGTAAA 
AACAATTAGG 



CTGAGTTCAA 
ACTTOTTAGC 
TAGTGTCGAO 
AAGGCTCAAT 



AOGACCCAQC 
AGACAAGCCC 
CAGTCGGGGC 
ACCGGGCCCC 
TGCTGGAGCT 
GGTTAAGGAG 
AGTCCAQCAG 
GGCCTGGGAG 
GAGACTOJTA 
GCTGCCCTGA 
TGCCCAGGGA 
ATCCTGATTC 
CATCCATTAT 



CCCACTGGAG 
GTATGTGTCA 
TACGAGCCCA 
CGCCCCACCA 
GCCACTGCTG 
AjQCTGCGAGC 



CTAGACCCTC 



TGGCGCTTAC 
GGTGGTGTOV 
AGGAACTCAC 
CGCATCTGCA 
ATGTATGTAA 
GCCTCTCACT 



1380
1440 
1500 



1740 
1800 
1860 
1920 
1980
2040 
2100 
2160 



21 



31 



41 



51 






I 1 I I . 

MDPEASDliSK GIIiALRDVAA EAGVSKYTGG RGVSVGPII.S SSASDIPCDN EKGPNPLFHN 
RGDGTFVDAA ASAGVDDPHQ HQRQVAIiADF NliDGKVDrVY GiniNQPHRLy U}HSTHGKVK 
FHDIASPKPS MPSPVRTVIT ADFDNDQEIiB IPPNKIAYRS SSAHRIiFRVI HREHGOPIiIB 
ELNPGDALEP EGRGTGGWT DFDGDGMLDL IIiSHGBSMAQ PLSVFHfflJQG POTOWLRVVP 
RTRVGAFARG AKWLYTKKS GAHiRIIDGG SGWiCEMEPV AHFOUiKpEA SSVBVTWPDG 
KHVSRNVASG EMNSVIiEILY PRDEDTLQDP APLETPMIAS SSHSCALETS PYVSTPMEAT 
GAGPTRSAVG ATSPTRMAQP AHGXiSASHRA PAPPPPPLLL PIiPLLIiPLLB LPLLHRSS 

Seq ID NO: 470 DNA sequence
Nucleic Acid Accession #: AJ279016
Coding sequence: 1..1962



21 



41 



51 



ATGTCCAGGA 



TGTTACCGTT 
AACCCATGTT 
CCCAGCTCAA 



II. 
CCTGCTGCTG CTCTGGTTTC TGCCCATCAC TGAGGGGTCC 
CACTGCAGTC ACCAACTCAG TTCTGCCTCC TGACTATGAC 
CTATGGTGTG GCAGTTACTG ATGTGQACCA TGATGGGGAC 
GTACAATGGA CCCAACCTGG TTCTGAAGTA TGACCGGGCC 
CGCGGXCGAT GAGCGCAGCT CACCCTACTA CXSCGCTGCXSQ 



TTGTTCAAGT 
CGTGGTGTGG 
GGAOOCTACT 
ATTGAAATGG 
CCTGCTOAGG 
CTCAGCAGCA 
CACAACCGGG 
CACCAGCATG 
OTCTATGGCA 



CTGGGGTC3U3 
GTGCCTCGGA 
GCGATGGCAC 



TGCCAATTAC 
CAGTGACCTC 
CAAATATACA 
TATCTTCTGC 
CTTTGTGQAC 



6CCTACGGTA t 



ATCACCGCCO 
CGCAGCTCCT 
ATCGAGGAGC 
GTGACCGACT 

gctcabccgc 
gtgcx:ao6CA 



ACTGOAATGG 
G6GACATOGC 
ACTTTGACAA 
CaVGCCAACXM 



CCCCCACCGC 
CTCACCCAAG 
T6ACCAGGAG 



TCGACGGAGA 



CCOGGTTTGG 



CGAOGCCTTG 
CGGGATGCTG 
CCGGGGCAAT 
GGCCTTTGCC 
GATCATCGAC 



CTCTACCCCC GGQATGAGOA 
TTCTCCCAGC AQGAAAATGG 
GTGTGCCCTC GAQACAAGCC 
AACAAGAAGT GCAGTCGGGQ 
CTCGGCCAGT CACCGGGCCC 
OCTGCTGCCG CTGCTGGAGC 
CIGQQGTOGG TGGTTAAGQA 



GAACGTGGCC 
CACACTTCAG 
CCATTQCATG 
CX3TATGTGTC 
CTACGAGCCC 
CtXSCCCCACC 
TGCCACTGCT 
GAGCTGCGAG 
aGGOAGTGCIG 



GACAATGAGA 
GCTGCGGCCA 
GACTTCAACC 
CTCTATCTGC 
TTCTCCATGC 
CIGGAGATCT 
GTCATCCGTA 
GAGCCTGAGG 
GACCTCATCT 
CAGGGCTTCA 
AGGGGAGCTA 
GGGGGCTCAG 
GAAGCCAOCA 



AGG6AICTAA AGQCCTGGGA 
TAACAATTAG GQAGACTOGT 
CAGACAGGGT CGCTGCCCTG 



GACCCAGCCC 
GACACCAATG 
AACACCTATG 
AACGAGGATG 
ACCCCCACCG 
GCACCGGTCC 
CCCAGCTGCT 
AAAGTGGGOr 
CCCCAAGCCC 



CCTGAGTTCA AATCCTGATT 
AACTTGTTAG CCATCCATTA 
TTAGTGTGGA GATTAGATTA 



AAGGCCAGGC 
ATGGCGCTTA 
AGGTGGTGTC 
CAGGAACTCA 
TCGCATCTGC 
AATGTATGTA 
TGCCXCTCAC 



CCTGTGCTGG 
CATTCCAGTG 
ACTGCACAGG 
CAAAGCTATG 
AAAAT GGGG A 
AGACACTTGG 
TGGGCTTTGT 



ATGGGCCTAA 
GTGCTGGTGT 
GTGATGGCAA 
AAATGAGCAC 
CCTCCCCTQT 
TCTTCAACAA 
GAGAGCACGG 
GCCGGGGCAC 
TGTCCCATGG 
ACAACAACT6 
AGGTCGTGCT 
GCTACCTGTG 
GTGTGGAGGT 
TGAACTCAOT 
CACTGGAGTG 
AATGCATCCA 
GAAGCXACAG 
GCACAGCCTG 
CT6CTGCTGC 
TCQTAGATGQ 
GAGCAGGGQT 
TOTGCTGCTG 
ATCCATGCAC 



CAAOGTGSCC 
AAAGGOCTCT 
TGATGCCCTC 
CAGAGATGTG 
GGGCCCCATC 
CTTCCTTTTC 
GGACGACCCC 
AGTGGACATC 
CCATGGGAAG 
Ca3CACX3GTC 
CATTGCCTAC 
AGACCCCCTC 
AGGGGGTGTQ 
AGAGTCCATG 
GCTGCGAGTG 
CTACACCAAG 1380 



TGAGATGGAG 
GACGTGGCCA 
GCTGOAGATC 
TGGCCAAGGA 
GTTCCCATTC 



AOATCTCAAT 
GGGACATQAA 
CCTAGACAGT 
ATTACTTAGC 
GTGATCACAG 
CCATATCTTA 



1140 
1200 
1260 



TGACCTTACA CCASTCACTX 
TTAAOAATAG AATCTTGGGO 
CACAAAACCT GGCRCATAGT 
CAACAC6 



1560
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 



MSRMLPFI.I.L LWPLPITEGS QRAEPMFTAV THSVLPPDYD SMPTQUnrOV AVTOVDHDOO 
FBIWAGYNG PNI.VI.KYSRA QKRLVKIAVO BRSSPYYAUl DSQOIAIGVT ACDIOGDGRB 
BIYFUiTNHA FSGVATYTDR I>FKFBXiNRWE DlbSOBVHVA HOVASLFAGR SVACVDHKQS 
GRYSIYIMIY AYGNVOPDAL lEMDPEASDL SRQILALRDV AAEAGVSKYT GGRGVSVGPI
LSSSASDIFC I}NBHGFMFI<F HHRGDGTFVD AAASAGVDDP HQHGRGVALA DFNRDGKVDI 
VYGNVOIGPHR LYIiQHSTBGK VRFRDIASPR FSMPSPVRTV ITADFDNDQE LEIFFNNIAV 
RSSSANRIiFR VIRRGHGDPL lEELNPGDAL EPEQRGTGGV VTDFDGDCaai DLILSHGESM 
AQP1.SVPRGN QGPinniHIjRV VPRTRFGAFA BGAKWliYTK KSGAHUillD GGSGYLCEME 
PVAHFGLGKD EASSVEVTWP DGKMVSBNVA EGEMNSVI.EI IjYPRSEDTLQ DPAPLECGQQ 
FSQQENGHCM DTNECIQFPP VCPHBKPVCV NTYOSYRCRT MKKCSRGYEP MEDGTACVGT 
LGQSPGPRPT TPTAAAATAA AAAAAOAATA APVI.VDGDLN JjGSWKBSCB PSC 






Seq ID NO: 472 DNA sequence



ATGGCGTGTC 
ABCGGCTCCT 
OTTCTGAAGT 
TCACCCTACT 



CACACCA6CT 
OCACCTACAA 
TCCTCCCTGa 



CTGAGCGATQ 
GCCTGTGTGG 
GQTAATGTGG 
GGCATTCTGG 
TTCTCCCACA 
GGAGGAGACC 
TGCCGOCTGG 
CAGAGGGAGG 
TCCAAAAGCC 
GCGCCT 



11 
I 

CGGGAGGACT 
CCCCAGCATC 
ATGACCX3GGC 
ACGOGCTGCG 
ACGGCCGGGA 
CAGCGCAGGT 
CCXXTTGCAGG 
GTCAGGCTTC 
GACTQAGACC 
CGTACACCGA 
AGGTCftACXST 
ACAGAAAGQG 
GCCCTGATGC 
CGCTCAGAGA 
CTGCCTCTCC 
CAGAGGAGGC 
GCTGGAAGGA 
CTGGGGCAGC 
ATTTGGCTGA 
CAGCCCACCC 



21 
I 

CCXa^GCCCGT 
CCCTCCCCAT 
CCAGAAGCGG 
GGACCGGCAG 
GQAGATCTAC 
CCCTTCTGGG 
CCTCCTGGGT 
TCCGGACAGC 
TACCCATGAA 
CAAGTTGTTC 



TCCTCCTCCA 
CTGGTGAACA 
GGGAACGCCA 
TTCCTCAACA 
CTCCACAGAA 
CTGCCTCCAC 
AGGCAGGGAG 



GGATGGGACT GGGTGGGCCC 
GGTACAATGG ACCCAACCTG 
TCGCGGTCGA TOAGCGCAGC 
TCGGGGTCAC AGCCTGCGAC 



CTCTGGACGC 
CCTCATTQAA 
TGTGGCTGCT 
AAGCATTGGT 
AGATGAGGAG 
CGGGCAGTTC 
TGGCXSTGCXX: 
CAAGAACCTA 



AAGTTCCGCA 
GTGGCCAGCC 
TACTCTATCT 
ATGGACCCTG 
GAGGCTGGGG 
GAGATATCTQ 



TTCTTCTGAG 
ATAACCGGTG 
TCTTTGCOGG 
ACATTGCCAA 
AGGCCAGTGA 
TCAGCAAATA 
GCAGAACOGA 



AAGQAAGAAG CAGCAGCTTT 



CATGTTACrA 



CCCCTTGTCa 

ccccaccccc 

CTGATGGCTG 
CTGAGAAGCT 
QAGCTGGGAd 
CTQGGAQAAC 



OTCAGCTAAT 
GAGCCCCAGG 
AGGCTTTGGG 
GGGAGGAAAG 
GTCCCTGGAG 
CTCCCATTTT 
CACAGGAGTO 



CTCTCCCATC CCCTGOTCCC 



ACCAGATQGA 



CTCGCOTGGA 
TTTAGQCTCA 
CTGCAGTTCC 
TCTGCCACTC 
ATCCTCAGCA 
T TCCA CAACC 
GCCTTCATOG 
CTAGCA6AAA 
CX»CATTGCC 
TTCTTGACGC 
CAGGGGGCCC 
ACTGCXTTATT 
TTGTCCTCTQ 
GCCCTGGCTG 



TCACCCAAflT 



CTGOTCCTTC 
ATCATGGTTT 
AAGGCTTGGC 
CACCCTGCCT 
ACATTGTCCT 
AAAQAGTCAA 
ACTTCAACCG 
TCTATCTGCA 
TCTCCATGCC 
TOQASATCrr 
GCTCCATCCT 
AAGQTTTAAQ 
CAGGTCCCCT 
GGAATGCAGG 
GAAATGTGGC 
ACAAAAAGGG 
TACCAGQAAA 



GACACATGQA 
AATGGACCCC 
CGCQTGGCCA 
CAGGCAGAAG 
CCAAGCCACA 
ACAAAOAACA 
CCATCTAGTO 
GCGAGAGATT 
CAACTTCCCC 
TGGGAATCCT 
AAAAGAGGAG 
OGAAGCAGAA 
CAGAGGCAGC 
GATGTCTTTT 
QGATATCTTC 
CACC; 
ATATI 
CTCCTCCTOC 
GTCTATGAGC 
CTCCAGXGCC 
TCTCGCAAGA 
GTGGTCTGCC 
CGTGGGTGTG 
TGATQGCAAA 
AATGAGCACC 



CGTCTOGCTG 
AAATGTAAGG 
GCGCTCAGCA 
GGGCAGGCCA 
CAGCACCT6C 
QACGGASATC 
GCCACCRTGC 
GGQAGAGAOA 
AGCTGCTTGA 
OGGAACTGGG 
GGGAAGATTC 



GCTGAAGCOT 
GGACTTTTCC 
GGTTCXrCTGC 
ACCCAAATCA 
GGAAOAC3VTC 
ACGCTCTGTQ 
TTACGCCTAC 
CCTCTCCXSG 
TACAGAAGGC 
GGAGOGGGAA 
CAOCCWVCTG 
GGTGGAGGAA 
TCTGCAGACT 
TTCTGTCTGC 



1020 
1080 
1140 



r GCCAGGGGGC 1380 

5 TGCACTCAGG 1440 

^ GCTGTATGAC ISOO 
3 AAGGGACTCX9 



CCTGTCCTCC 
CTAGGGGGCC 
TGCGACAATG 
QACGCTGCGG 



OAAAACTAGC 
GCCGCCATGC TGAGCCCGGC 1320 
CCACTGTGGT 
TGTCCAGATG 
CTGCTAGAGA 
CAGGGAGGAG 
CAGCTCTOOa GaOACTOSAG 1620 
1680 
1740 
1800 
1860 
1920 
1980 



TTCTGGACRT GGCX3WW5GCC 
ATGGAGACCA TGAGCXZCAGA 
GCTCCTCTGA GGAGCCTCTG 



GAGGCX3TC3M; CGTGGGCCCC 
AGAATGGGCC TAACTTCCTT 
CCAGTGCTGA ACGTC»TTTA 



GCTCCCTGTG 
ATCCCAGAGA 
GACGACCXrCC 
GTGGACATCG 



CACTCAGCCT 
TCCTGGGGTC 
GCCTGATGAC 
ACCAGCATGG 
TCTATGGCAA 
TCCGCTTCCG 



CTTCAACAAC 



AATCAQAAGG 
GATGAAGAAA 
GCAAAGCCTG 
CCAAAGTGTG 
GCTACAGGGT 
AOGGGCTACG 



ATTGCCTAbC 
TCTTCATCCT 
GGAGGGTrCC 
CAGAAAGOAA 



6CAGCTCCTC 
TSACAGCTGG 
CAGGGCCAGG 



CCCAGAACCC 
CX»ATCACTA 
GGGTCCAATC 



AGGQGCTAOS 
AGAAAGGGGC 
CCAGGAAAAG 
ACTACCAGGA 



GTCCAATCAC 



<3CTATGG6GT 



GGGTCCAATC 
TACGGGCTCC 
GGGCTACAGG 
AAAGGGGCTA 
AGGAAAAGGO 
TACCACAGAA 
TCACTACCAG 
CCAATCACTA 
GGGTCCAATC 
GA6ACCCCCT 



CCAGGAAAAG 
ACTACCAGGA 
AATCACTACP 
GTCCAATCAC 
CGGGCTCCAA 



AGGGGCTACG 
GAAAAGGGGC 
CCAGGAAAAG 



TTGTCCCATG GAGAGTCCAT 
r GGCTGCGAGT 
: TCTACACCAA 
r GTGAGATGGA 



CATCGAG6AG 
GGTGACCGAC 
GGCTCAGCCG 
GGTGCCACGC 
GAAGAGTGGG 
GCCCQTGGCA 



GGGCTACAGG 
AAAGGGGCTA 
AGGAAAAGGG 
TACCAGGAAA 
TCACTACCAG 
CCAATCACTA 
GGGTCCAATC 
TACGGGCTCC 
GGGCTACGGG 
AAAGGGGCTA 
CTCAATCCCG 



CGGCCTCTGC 
AAGCGCCACR 
CCAGGAAAAG 
ACTACCAGGA 
AATCACTACC 
GTCCAATCAC 



3 TQAGGTGGCC AGATGGCAAG 



CTGTCCGTCT 
ACCCGGTTTQ 
GC CCACC TGA 
CACTTTGGCC 
ATGGTGAaCC 



GCTACGGGGT 
AGGGGCTACG 
GAAAAGGGGC 
CCAGGAAAAG 
ACTACCAGGA 
AATCACTACC 
CTCCAATCRC 
CGGGGTCCAA 
GCGACGCCTT 
ACGGGATGCT 
TCCGGG6CAA 



GTTCTATTCA 
CCAGGGTTCT 
TCTGATCCCC 
CCACAGCTAT 
GCGAGGTGTC 
CTGGAATGGC 
GGACATCGCC 
CTTTGACAAT 
AGCCAACCGC 
TGGGAGGAAC 
GGGTCAGGCC 
GQACTQGGCA 
TATTGCRGGG 
AGATACAAAG 
GGGCTACGGG 
AAAGGGGCTA 
AGGAAAAGGG 
TACCAGGAAA 
TCACTACCAG 
CCAATCACTA 
GGGTCCAATC 



GGGCTACAGG 
AAAGGGGCTA 
AGGAAAAGAO 
TACCAGGAAA 
CGTCATCCGT 
GGAGCCTGAO 



GGATCATCGA 



TCAGGGCTTC 
CAGGGSAGCT 
CGQGGQCTCA 



OGAAOGTGGC CAGGGGGGAO 



2160 
2220 
2280 

2400 
2460 
2520 
2580
2640 
2700 



3000 
3060 
3120 
3180 
3240 



3420 
3480 
3540
3600 
3660 
3720 
3780 
3840 
3900 
3960
4020 
4080 
4140 
4200 
4260 
4320 
ATQAACTCAG TGCTGOAGAT 
CCACTGGAGT GTGGCCAAGG 
GAATGCATCC AGTTCCCATT 
GGAAGCTACA GGTGCCX3GAC 
GGCACAGOCT GCGTGGGTAC 
CCCAAAAAGG AGCTGCAACT 



CCTCTACCXC 
ATTCTCCCAG 
CGTGTGCCCT 
CAACAAGAAG 
TGAGCTAGGC 
TTCCCAAGGC 



CAGGAAAATG 
CGAGACAAGC 
TGCAGTOGGG 
TCTAGGCATA 
ATCTGCACCC 



CAGAAAGCTC CAGGTATTCC AGAAGCXX»A GTGTATGAAC 



I 



ACACaCTTCA GGACCCAGCC 4380 

GCCATTGCAT GGACACCAAT 4440 

CCGTATGTGT CAACACCTAT 4500 

GCTACQAGCC CAACX3AGQAT 4560 

CAATGACX3TG GAAACCAAG6 4620 

CCGTCTGGTC CTTTTTCCTG 4680 

CTGCTCCXSUS CACCCTTCTC 4740 
AAGATCAOOA ATAA 






SPYYAIiRDRQ 
FPTTFAGLLG 
GVATYTDKLF 
GNVGPOALIE 
GGOPEEADEE 
SKSKLADRNI. 
PHFRAPQ4DP 



LPPLSGROFS SSLGQASPDS RQGERVPVPC CRG6LRPTKE PEPFItUtPKS 180 



MSGDGSTSQI. 
PQPPCyYSVC 
KCKGRHAEPG 
QHLPARBLYD 



LSDEVNVAKG 
GIUlLRDVAA 
CHLGWKDGQF 



VASIiFAGRSV 
BAGVSKyTEG 
KEEAAALVEE 
RQAPQHYPVA 
ALSTTWPGG 



ACVDRKGSGR YSIYIANYAY 240 
FSHTASPSIG I 
QREAGAAGVP RGRVRTALQT 360 
PLVTQLHTHG S 



3 RLAGKLARSV 420 




APIVHLKYHL 
FIiTQGLASSA 
LSSERVNVGV 
SPKFSMPSPV 
GQ6EGLRIRR 
KGKOIVAQSV 
KGPITTRKRG 



DDPHQHGRGV 
HTVITADFDN 
GGPPGPGGQA 
PRTQAPQDTK 



GR6TGGWTO 
JCWLYTICKSG 
rOISVIjBILYP 
GSYRCRTNKK 
PQaiLI.UCRA 



RKRGYGVQSL 
PITTRKROyn 
GLRAPITTRK 
FDODGMLOItl 
AHLRIIDGGS 
ROBDTtK^DPA 



QGAPPCIiAR 
AUDPNRDGK 
DQELEIFFNN 
KVNTGPLMKK 
PHYHKKGLOG 
ATGSNHYQBK 



GQAMSRCAIiR 
ATMPALGGLE 
GNWVLDMAKA 
PVLQVGLGIiA 
DAAASAERRL 
PTHTGSRPyS 
-IPESLMTHSY 
KGKVRFRDIA 



PITTRKRGYG 
OLQGPITTRK 
YQEKGLRGPI 



PHRIiYLQHST H 
LFRCSILARG E 
RGCGNAGQSL AKEPASAIAO 1020 



3 SSSLTAGGRN 960 



RGYGLQSLPO 
TTRKRGYGLQ SLPGKGATGS 
KGPITTRKRG YGLQSLPGKE 
RBHODPLIEB UIPGDAI.EPE 
TRPSAFARGA 



QWJAAPSTLli 



PI.ECGQGFSQ 
GTACVGTBLG 
QKAPGIPEAQ 



SRKTMTWKPH 



8VEVTWP06K MVSRNVASGB 
ECIQFPFVCP RDKPVCVMTY 
PKlCEIiQI.SQG ICTPVWSFFL 



Seq ID NO: 474 DNA sequence 
Nucleic Acid Accession #: 
Coding sequence: 1..1152 



ATGAGTGCAC 
CAAAACGTTC 
GCTGCTGGCA 
AAGGAAAAAO 
GGATTCGTGG 
GACAACCTTG 



AGAAG6CTCC 
AATGIGGIGT 
CTQGCACCCT 
ATCACAGCOO 
ACACAAGCCC 
GAGTTTTTGG 



CAAGTGGGAC 
CCATGGACCC 
TQAGCACACA 
CTGCTGCTGA 
CAAGACAAAT 
GGTTTCTGAA 

CTCGCTCTCT 



AGATACTGGA 
AGAGA6CAGT 
GAATCTGCTA 
ACTGCCXAGG 
GATCATGAAA 
AGAGTTTCCT 
ASAIGGOGTr 
CAGCATTTCC 



GTACCGCATG CCTCAOCCTC 
GAACAGGTGG AGAaGGTTAA 
ACGGATGTGG CCCCTGTAAG 
TCAAAGCACC TACATQAGG6 
CAOGAOCraG AS(» 
CAAGAACTQT GA 



GATTACCAGC 
CCTGGTCATC 
ATCCAACTTT 
CATCCGTGCC 
ACMCOXCGS 
T6AACCCAGC 



ATCTTTATTG 
CTCCTGCTGA 
AATGAGGCAG 
GACAAAAACT 
CGGTTGAAAA 
CASAAGQTCC 
TCTGOCATCC 
GTACTCTTGG 
AGTACCSVIGG 



ATGAGCTCCG 
GGCACGATAA 
GTGAGCTTGA 



CTTTCCTTAG 
CTCAGACGAG 
GTCACKaoC 



ACTACG6AAA 
ACAAATTGAA 
CTGGCAATAC 
CCAGA6CCAA 



r AAACATTCTC AACAATAAXT ATAAGATTCT C 



1200 
1260 
1320 
1380 
1440 
1500 
1560 



C TAAGTATTTC 180 

GGCCTGGAAC 240 

TAAAGCTCTG 300 

AGGCCAGCAG 360 

GGATAACATA 420 



TTACXMACTC 
TCTTCAGTCA 
TGAAAGCGGT 
AGTCAAGCTC 






11 



21 



31 



51 



I I- I I I I 

MSALFLGVGV RAEEA6ARVQ QHVPSGTDTa DPQSKPLGDH AAGTMDPBSS IFIEDAIKYF 
KBICVSTQin.1. I.U.TDHEAMK GPVAAAEIM MBADBUIKAL DHLHROMIMK DKHHHDRBQQ 
YRHHFUCBFP RblCSEIiEDNI RSIAAIiAOaV QKVBKGTTIA MWSOSIiSIS SGIbTLVatS 
lAPFTESJSL VI.LEP6MEI.G ITAALTCZT8 STMDYGXKHH TQAQAHDLVI KSLDKIiKBVR 
EFLOEHISNF IiSXiAGMTYQI. TROIGKDIRA IiRRARANLQS VPHASASRPR VTEPISABSO 
EQVERVNKPS ILEMSRGVKL TDVAPVSFFL VLDWYIiVYE SKHLKEGAKS BTAEELKKVA 
QELEEKLNIL NNNYKILQAD QEL 



Seq ID NO: 476 DNA sequence 

Nucleic Acid Accession #: NM_014452.1 






WO 02/086443 

1 1 

ATCGGGACCT CTCCXSAGCRG 
GCCACAGCCA CGATGATCGC 
GCTCAGCCAG AACAGAAGGC 
ACCGGCCAGG TGCTAACCTG 
ACCAACACAA GCXnCCGCQT 
AATGGCATAG ASAAATGCCA 
TTACCTTCTG CTGCCTTQAC 
AACOCTACCT OTOCCCCCCA 
ACAGAGACTG AGGATGTGCG 
TCTAGTGTGA TGAAATGCAA 
AAGCCGGGQA CCAAGGAGAC 
ACCTCACCTT CCCCTGGCAC 
GTCCCTTCCT OCACTTATOT 
TCTGTTAGAC CAAAGOTACT 






CAGCACCGCX: 
GGGCTCCCTT 
CTCGAATCTC 
TGACAAGTGT 
CT6CAGCAGT 
TQACTGTAST 
TGACCGAaAA 
TAOtSGTGTGT 
GTGTAAGCAG 
AGCATACACA 
AGACAAOGTC 



CTCGCCTCCT 
CTCCTGCTTG 
ATTGQCACAT 



TGCCCTGTGG 
CAGCXZATGCC 
TGCACTTGCC 
CCTGTGGGTT 



r CGCCCBCCGA 
3 CACCACCACA 
r TGACCGTGCC 
: TGAGCATTGT 
: CAGGCATGAG 



GCAGCCGCAT C 
GATTCCTTAG C 
ACCGCCATGT 1 
CCTATGTCTC 1 
GGACCTTTAC C 
CATGGCCAAT GATTGAQAAA '«3e0 



GTTCCAGTCT 
GAAGAAAGGG 
AOATGTGCCT 



TCCCaVAAGGC 



GACTGTCTGA 
TGTGGCACAC 
CCACGCCCTG 



CAGCAAGCCC CCCACCACAG 
GGCaAGAAGT 
CACAAGCATT 
GTGCTTGTGG 
CCCCGGCAGG 
ACCCAQAACC 
CTTGTAGCAG 
AGIGAGAGQG 



TTGACATCAA 
TGATTGTGGT 
ATCCCAGTGC 
GGGAGAAATG 
CCCAAOTGQG 



TCCCCACyWKf 
GACTCTACAT 
AAGAAGGACA 
GATGACATGC 



TGGAAACTGA 
CCAGCCCCSkA 
ACAAOAACAA 



CX3TQAACAA6 
ACACATOCTG 
CATCAAGGGC 
TGAOCATTTG 
GTGCAGTATC 
CATTGTGQAA 
GATCTACTAC 
AAGCCAGTGG 
TTTCTCCAAT 
CATCOGGGGC 
AAAOaATGTT 
CAAACTAGCT 
CGCGAAACTT 
GGGCTTCTTC 



CAGGAAGQQA 
ACCCTCCCAA 
AAGCTGCTGC 



AGCACATGGA 
CAGAATCCAA 
CAQTCCCTQA 
ACCTTCAGST 



CTCCAGCTCC 
AACCCATGAA 
CrCTTCTGCC 
CAACACAA6C 
AGTCAACCAC 



OGTCCATGQA GQCCACIGGG 960 



TGCAATGGCC 
AAAGATATCT 
GGGTACACAG 



GACAT CCTAO ACAGAACCTA 
TTGTGCTTTT CCTGCTGCTG 
CXJAGOACTCT GAAAAAGGGG 
TGAAGAAATC CATGACTCCA 
ATGGTATCGA TATCCTGAAG 
ATCAGTTTCT TTGCAATGCC 
CCGACCAOOA GOGGGCCTAC 



CAGACCCTCC 



CAGTGTTGCG 
TCCACTTTCT 
AACTAGACCG 
TGGACTCTGT 



GCAGGTACGC 
AAATCCTGAG 
GCTATTCGAA 
TTATAGCCAT 



CmSGAGAAGA 
CTCOOGATGA 
GAGAATTCCG 
GTGGATGAGT 
AGCAGGAACG 



TTCGTGGGCT 



CTCTCCTSVC 
CGGAGCCCCT 
GTTCCTTTAT 
GTGACTTGCA 
TQATTQAAQA 



2 TGCTGTAQ 



GAKSGAAGAC 
GCTTAGCCC3G 
GGTGGAGCCT 

TACCAAAGAA 
GCCTATCTTT 
GATTCXCCAG 
GGAAGCCAGC 



1020 
1080 
1140 
1200 

1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 






MGTSPSSSTA 
TGQVLTCDKC 
LPCAALTDRE 
SSVMKCXAYT 
VPSSTYVPKG 
QQGFHRRUIIi 
VLWIWCSI 
LVAAQVGSQW 



11 
I 

lASCSRIARR 
PAGTYVSEHC 
CTCPPGMFQS 
DCLSQNLWI 
MNSTESNS3A 
KUiPSMEATG 
RKSSRTLKKG 
KDIYOFLCNA 
VEKIRQLMED 



21 
I 

ATATKIAGSL 
TNTSLRVCSS 
NATCAPHTVC 
KPGTKETDNV 
SVUPJCVIiSSI 
QEKSSTPIKG 
PRQDPSAIVE 



41 



51 



LLLGFLSTTT 
CPVGTPTRHE 
PVGWOVRKKG 
CGTLFSFSSS 
QEGTVPiajTS 



SPQOKHKGFF VDESEPLLRC 
EliRVIEEIPQ 



TTQLETDKLA 
DSTSSGSSAI, 
AEOKIJ3RLFE 



Seq ID NO: 478 DNA sequence 
Nucleic Acid Accession #: XM_044533 
Coding sequence: 238..2751 



I I 
AQPBQKASNI. IGTYRHVDRA 
NGIEKCHDCS QPCPWPMIEK 
TETEDVHCKQ CARGTFSDVP 
TSPSPGTAIP PRPEHMETHE 
SAR6KEDVHK TLPJTLQWNH 
HKBFDIHEHL PWMIVLFLLL 
TQUKEKWIYY CNGMQIDILK 
AALQHWTIRG PEASIAQLIS 
SPIPSPHAKL ENSALIiTVEP 
KKCTVUiaVR UJPCDLQPIF 
QTUiDSVYSH If DLL 



21 



31 



41 



51 



I. I I I 

AGCCQAGGCT GCSGGGCOGG CGCXX3GCX3GG AGGACTGCG6 TGCCC060GG 
AGGGGCTGAG TTTGCCAGGG CCCACTTGAC CCTGTTTCCX: ACCTCCCX3CC CCCCAGGTCC 
GGAGGCGGGG GCCCCCGGGG CX3ACTCGGGG GCGGACCGCX3 GGQCGGAGCT GCCQCCCGTG 
AGTCCGGCCG AGCCACCTGA GCCCX3AGCCQ CGGGACACCG TCGCTCCTGC TCTCCXSAATG 
CTGCGCACCG CGATGGGCCT GAGGAGCTGG CTCGCCGCCC CATGGGGCGC GCTGCCGCCT 
CGGCCACCGC TGCTGCTGCT CCTGCTGCTG CTGCTCCTGC TGCAGCCGCC QCCTCCGACC 
TGGQCGCTCA GCCOCCGGAT CAGCCTGCCT CTSGGCTCTG AAGAGOGGCC ATTOCZCAGA 
TTGQAAGCro AACACATCTC CAACTACACA GCCCTTCTGC TGAGCBGGBA TCGCA06ACC 



TTCAAGGGCA 
AGOGGCAGTC 
AACATGGAGA 
AAGGGCC3GTT 
CTCTACACTG 
ASCCTfOSCC 



ACCAGOAQCT 
AGGACCCACA 
ACCTGTTCAC 
ACTTCACCCT 
GTCCCTTCGA 
GAACAGTCAG 
CCAOCAAGAC 



GCGCGACTGT CAAAACTACA 
CTGTGGCACA GCAGCCTTCA 
GGCAAGGGAC GAGAAGGGGA 
CCCGAATTTC AAGTCCACTG 
CAGCTTCCAA GGGAATGACC 
CGAGAGCTCC CTCAACTGGC 
OAGCCTGGGC AGCTTGCAAG 



AGAAOAAACA 
TCAAGATCCT 
GCCCCATGTG 
ATGTCCTCCT 
CCCTGGTGGT 
CGGCCATCTC 
TGCAASACCC 



TCCTTCCTCA 
CTGCAGGATG 
GGGGTCTTCA 
ACAATGAAQG 
CAGCAGTGaT 
AACaGTOCCC 
TTCCTCAAGQ 
CCCCAGGCrC 
6ATGTCCTCT 



TCTGCAAGOQ 
AGGOCCAGCT 
TCTTCACGCT 
CTTCCCAGTG 
ATGTGCAGAa 
ACACC6TGAC 



CGATGAGGGT 
GCTOTGCTCA 
GAGCCCCAGC 
GCACAGGGGA 
AGTCTTCAGC 



CGGCCOSACG 
CCCCAGGACT 
ACTACAGAAG 
GGCCrCTACA 
CCCACACCCC 



GGGAAAGGAA GATCAACTCA TCCCTGCAGC 



OCTACCAGOG OaTOOCTGTA C 



TGCTACAGCA 
ATGGCTTCCC 
GGCGTGACAC 

AGGAGGTGAA 
GGCCTGGAGC 
TCCCAOACCG 
GCCGCATOCT 



GCAGTGCAGC 600 
CCTGCCGCTC 660 
TACCTACATC 720 
GGAAGATGGC 780 
TGATGGCGAG 840 
GOGOAGCCAA 900 
AaCrrTTGTQ 960 
CAAGATCEAC 1020 
TQTGTCCOOC 1080 
1140 
1200 
1260 
1320 
1380 



CTTCAACGTG 
CCTTTTCTAT 
CTGTGTCTTC 



i TCATTSAGGA GCTGCAGATC 1 



V AGGCAGTGAG 



GTGCATCACC 
CGTGCTGAAC 
GCTGCTGCAG 
CCACACCTAC 
CGTGGGCCCC 
6CAGAATCTG 



1500
1560 
1620 
1680 
1740 




CTCCTOSACA CCX^CAGGGG GCTGCTGTAT GCGGCCTCAC ACTCOGaCQT AGTCCAGGTG IBOO 

CXMATGGCCft. ACTGC3M3CCT GTACAGGAGC TGTGGGGACT GCCTCCTCGC CCGGGACCCC 1860 

TACTGTGCTT GGAGOSGCTC CABCTGCAAB CAOQTCAQOC TCTACCAGCC TCAGCTGGCC 1920 

ACCAGGCCGT GGATCCAGGA CATCGAGGGA QCCAiSOGCCA AGGACXTTTTG CAGCQCQTCT 1980 

TCGGTTGTGT CCCCGTCTTT TGTACCAAC» GGGGAGAAGC CATGTGAGCA AGTCCAGTTC 2040 

CAGCCCAACA CAGTGAACAC TTTCGCXTTGC CCQCTCCTCT CCAACCTGGC GACCCGACTC 2100 

TGGCTACGCA ACGGGGCCCC CGTCAATGCC TCGGCCTCCr GCCACQTGCT ACCCACTGGQ 2160 

GACCTGCTGC TGGTGGGCAC CCAACAGCTG GGGGAGTTCC AGTGCTGGTC ACTAGAGGAG 2220 

G6CTTCCAGC AGCTGGTAGC CAGCTACTGC CCAQAGGTGG TGQAGGACX3G GGTGGCAGAC 2280 

CAAACAGATG AGGGTGGCAG TGTACCCGTC ATTATCAGCA CATCGCGTGT GAGTGCACCA 2340 

GCTGGTGGCA AGGCX3VGCTG GGGTGCAGAC AGGTCCTACT GGAAGGAGTT CCTGGTGATG 2400 

TGCAOGCTCT TTGTGCTGGC CGTGCTGCTC CCMSTTTTAT TCTTGCTCTA CCGGCACCGG 2460 

AACAGCATGA AAGTCTTCCT GAAGCAGGGG GAATGTGCCA GCGTGCACCC CAAOACCTGC 2520 

CCTGTGGTGC TGCXICCCTGA GACCCX3CCCA CTCAACGGCC TAGGGCCCCC TAGCACCCCG 2580 

CTCGATCACC GAGGGTACCA GTCCCTGTCA GACAGCX:CCC CX3GGGTCCCG AGTCTTCACT 2640 

GAGTCAGAGA AGAGGCCACT CAGCATCCAA GACAGCTTCG TGGAGGTATC CCCRGTGTGC 2700 

CCCCGGCCCC GGGTCCX3CCT TGGCTCGGAG ATCCGTGACT CTGTGGTGTG AGAGCTGACT 2760 

TCCAGAGGAC GCTGCCCTGG CTTCAGGGGC TGTGAATGCT CGGAQAGGGT CAACTGGACC 2820 

TCCCXrrCOSC TCTGCTCTTC GTCGAACAOG ACXSSTGGTGC CCXSGCCCTTG GGAGCCTTGG 2 880 

GGCCAGCIGG CCTGCTGCTC TCCAGTCAAO TAGCGAAGCT CCTACCACCC AGACACCCAA 2 940 

ACAQCCGTGQ CCXX»!QAQGT CCIGGCCAAA TATGGGGGCC TGCCTAGOTT GQTGGAACAG 3000 

TGCTCxrrrAT gtaaactgag cccrrrKyrrt aaaaaacrat tccaaatgtg aaactagaat 3060 

GAOAGGGAAG AGATAGCATQ GCATGCAflCA CACACGGCTG CTCCAGTTCA TGGCCTCCCA 3120 

GGGGTGCTGG GGATGCATCC AAAGTGGTTG TCTGAGACAG AGTTGGAAAC CCTCUCCAAC 31;80 

TGGCCTCTTC ACCTTCCACA TTATCCCGCT GCCACCXjGCT GCCCTGTCTC ACTOCAGATT 3240 

CAGGACCAGC TTGGGCTGCG TGCGTTCTGC CTTGCCAGTC AGCCGAGGAT GTAOTTGTTG 3300 

CTGCCGTCGT CCCACXaVCCT CAGGGACCAG AGGGCTAGGT TGGCACTGCG GCCCTCACCA 3360 

GGTCCTGGGC TCGGACCCAA CTCCTGGACC TTTCCAGCCT GTATC AGGC T GTCGCCACAC 3420 

GAGASGACAG OGCaAGCTCA GGAGAGATTT CXSTGACAATG TACGCCTTTC CCTC3«3AATT 3480 

CAGGGAAGAG ACTGTCGCCT GCCTTCCTCC GTTGTTGCGT GAGAACCOJT GTGCCCCTTC 3540 

CCACCATATC CaCCXTTCGCT CCATCTTTGA ACTCAAACAC GAGGAACTAA CTGC3W:CCrG 3600 

GTCXrrcrCCX: CAGTCCCCAG TTCACCCTCC ATCCCTCACC TTCCTCCACT CTAAGGGATA 3660 

TCAACACTGC CCAGCACAGG GGCCXTGAAT TTATGTGGTT TT TATA CATT TTTTAATAAO 3720 

atgcacttta tgtcattttt taataaagtc tgaagaatta ctgttt 



I I I I I 1 

KLRTAMGUiS WIAAPWGALP PRPFUiIiIiLIi LLLLIiQPPPP TWAIiSPRISL PLGSEERPFL 60 

RFEAEHISNY TALLLSRDGR TtYVGAREAL PAIjSSNI.SFL PGGEYQEIJ.H GADAEXKQQC 120 

SPKGKDPQRD CQHY1KII.LP LSGSHtiFTCG TAAFSPMCTY IMMBHPTLAR DEKtaiVLLED 180 

QKORCProPN FKSTAI.WDG EbYTGTVSSP QGNDFAI8RS QSIAPTKTES SLNWLQDPAF 240 

VASAYIPBSL GSLQGDDDKI YFFFSETGQE PBPFENTIVS RIARICa«3DE GGERVLQQRW 300 

TSFLKAQLLC SRPDDGFPFN VLQDVrn.SP SPQDWRDTLP YQVPTSQWHR GTTEGSAVCV 360 

FTMKDVQRVP SGLYKEVNRE TQQWYTVTHP VPTPRPGACI TNSARERKIN SSUIIjPDRVL 420 

KFLKDHPUffi GQVHSHMLI.I. QPQARYQRVA VHRVPGLHHT YDVI.PLGTGD GRLHKAVSVG 480 

PRVHllEBLQ IFSSGQPVQN LLLDTHRGLIi YAASHSGWQ VPMANCSLYR SOSDCLLARD 540 

PYCAWSGSSC KHVSLYQPQI. ATRPWIQDIE GASAKDLCSA SSWSPSPVP TGEKPCEQVQ 600 

PQPNTViaTLA CFLLSNIATR UILHNGAPVN ASASdVLPT GDLLLVOIQQ LGEPQCWSLE 660 

EGFQQLVASY CPEWEDGVA DQTDEGGSVF VIISTSRVSA PAGQKASHQA DRSYWKEFIiV 720 

MCTLFVLAVL LPVLFULYRH BNSMKVFLKQ GECASVHPKT CPWLPPBTR PLHGLGFPST 780 
PLDHROYaSL SDSPPGSRVF TBSEKRPLSI QDSFVEVSPV CPRPRVRU3S EIHDSW 

Seq ID NO: 480 DNA sequence 

Nucleic Acid Accession #: NM_004217.1 

Coding sequence: 58..1092 

1 11 21 31 41 51 

GGCCGOOAGA eTAGCAGIGC CTTGGACCCC AGCTCTCCTC CCCCTTTCTC TCTAAGGATG 60 

GCCCAGAAG6 AGAACTCCTA CCCCTGGCCC TACGGCXGAC AGACGGCTCC ATCTGGCCTG 120 

AGCACCCTGC CCX3U3CGAGT CCTCCGGAAA GAGCCTGTCA CCCCATCTGC ACTTGTCCTC 180 

ATGAGCtJGCT CCAATGTCX» GCCCACAGCT GCXXXTGGCC AGAA GGTGA T GGAGAATAGC 240 

AGTGGGACAC COGACATCTT AACGCGGCAC TTCACAATTG ATGACTTTGA GATTGGGGGT 300 

CCTCTGGGCA AAGGCAAGTT TGGAAACGTG TACTTGGCTC GGGAGAAGAA AAGCCATTTC 360 

AXarroGCGC TCAAGGTCCT CTTCAAGTCC CAGATAGAGA AGGAGGQCGT GGAGCATCAQ 420 

CIGCGCASAG AOATCXSAAAT CCAGGCCCAC CTGCACCATC CCAACATCCT GCGTCTCTAC 480 

AACTATTTTT ATGACOGGA6 GAGGATCTAC TTGATTCTAG AGTATGCCCC CCGCGGGGAG 540 

CTCTACAAGQ AGCTGCAQAA OAGCIGCACA TTTGACX3AGC AGCGAACAGC CACGATCATG 600 

GftGGAGTTGG CAGATGCTCT AATGTACTGC CATGGOAAGA AGGTGATTCA CAGAGACATA 660 

AAGCCAGAAA ATCTGCTCTT AGGGCTCAAG GGAGAGCTGA AGATTGCTGA CTTCGGCTGG 720 

TCTGTGCATG CX3CCCTCCCT GAGGAGQAAG ACAATGTGTG GCACCCI6GA CIACCTGCCC 780 

CCAGAGATGA TrGAGGQGCG CATGCACAAT GAGAAGGrGQ ATCTGTQGTG CATTOGAGIG 840 

CTTTGCTATG AGCTGCTGGT GGGGAACCCA CCCmOAiGA GTGCATCACA CAACQAGACX: 900 

TATCGCSXSCA TCGTCAAGGT GGACXTAAAG TTCCCCQCTT CTOTGOCCJK: GGOAOCCCAG 960 

GACCTCATCT CCAAACTGCT CAGGCATAAC CXX:T0GGAAC GGCTQCOCCT GQOCCAGGTC 1020 

TCAGCCCACC CTTGGGTCCXJ GGCCAACTCT CGGA«3GGTQC TGCCTCO CTC TQCCCTTCAA 1080 

TCTGTOGCCT OATGGTCCCT GTCATTCACT CGGGTGCQTG TGTTTGTATO TCTO TGTAT G 1140 

TATAGGGGAA AGAAGGGATC CCTAACTGrr COCTTATCTQ TTTTCTACCT CCTCCTTTGT 1200 
TTAATAAAGG CTGAAGCTTT TTGT 




! I I I I I 

MAQKENSYPW PYGRQTAPSG LSTLPQRVLR KEPVTPSALV LMSRSNVQPT AAPGQKVMEN 
SSGTPDILTR HFTIDDFEIG RPLGKGKFGN VYLAREKKSH FIVALKVLFK SQIEKEGVEH 
QLRREIEIQA HLHHPNILRL YNYFYDRRRI YLILEYAPRG ELYKELQKSC TFDEQRTATI 
MEELADALMY CHGKKVIHRD IKPENLLLGL KGELKIADFG WSVHAPSLRR KTMCGTLDYL 
PPEMIEGRMH NEKVDLWCIG VLCYELLVGN PPFESASHNE TYRRIVKVDL KFPASVPTGA 
QDLISKLLRH NPSERLPLAQ VSAHPWVRAN SRRVLPPSAL QSVA 

Seq ID NO: 482 DNA sequence 
Nucleic Acid Accession #: AK055663 
Coding sequence: 38..1423 






I 

AGAACGGCTT 
AAAACCACAA 
CCGAAGGTCC 
GCTTATGTGG 
TTTTGATCTT 
TAGCCCTGTC 
AOTCrrGGCA 
ACAGCCCQAG 
CCTGTTCACG 
TACXIAGCTGQ 
GGGACTTAGC 
AGCATTTGCT 
CACTOCCTCT 
GTACAGTGGG 
ACTCATCAGA 
GACCCTAQGT 
TGAACAAATG 



11 

I 

CCGGCGGGAG 
AGATCCTTTT 
TGGAAGATAC 
TGCAGTTCTA 
TTTAGTTTAA 
TATTCATTTG 
CAGTTGGGAG 



I 

' CCTTATCATG 
' GTTACGGGAA 
TGTAATAAAC 
AGCTTTAACT 
AAT7UVGTTAC 



AT6CTTTCTA 



CAAT6TCCTA 



TAGATATGGA 
TTATAAGGAA 
TGTTTAATCA 
TATGAAACTA 
GCTTTAAATA 
GTTTTGTAGI 



AGTATCTTCC 
CTTTGTATTA 
GCTATAGCTA 
AAAGTCTTAC 
GAGGTATCTA 
TTTGGCTCAT 
GTTCTTGCTC 
TTCAAGQATG 

AAcrrrrcAG 

CCAGTTACAT 
CCTGGOAAAA 
GGTCTCAATC 
GGAATTGGAG 
ACTAATAATA 
TATTGACTCC 
TTTACTCTAA 
TATTTTTQTA 
GGCTTCCTTT 



CTAATAGTAT 
TGACATGTTT 
GGTTTGAAAG 
CTCTCTTTAT 
GAAGATTATT 
TTCGGAATAA 
ATGTTGCAGA 
TTCCCCGAAT 
CATATATGCT 
TTGCCTTGAT 
TCCAGACAAC 
CCTTAGATGG 
TGGCTeOATC 
ATGTGACCAA CAGGCTGTAC 



ATTAAAAGAA 
AGTT GGTA CT 
ACCTTTTGCT 
TCTTAGTCGA 
GAATCCATTT 
CaTTOAAATT 
GACATTTGGC 



41 

I 

GGGACAATTC 
TTTAGACTTG 
TTGATATGTA 
GCCTATACTT 
TGGGTAACAT 
CTGGCTGTAT 
AGTGCAGAAC 
TTTGTGGCTC 
TATGTCTCAG 



TAGCAGCTGA 



AGTTTTAGAA 



ACTGGATTAG G 



CAACTGCA6C 
ATGTGAACCC 
ATGGACACAC 
CAACTCAAGG 
OAATTGGACA 
TTGGCTTCX» 
ATQTTAGATA 
AAATGTATTT 
AGAAAJ 



AATCCCAATG 
TAAAOCTAQT 
AOTTATTCTT 
ACCTTACAQC 
ATTGAGGACT 
ACCAAGACCA 
ATTTATTTAG 
ATAGTACTCT 



GTTTTQATTG 
AATAATTATT 
ACTATGTATC 
aTTATTGGTC 
GTCCGAAATG 
AGAATTCGAC 
ACTCTAGTGT 
TTGTCTGGGC 
OCTCTTTTAA 
ASTCCACCTC 
CTAAACACAC 
AGCATGCTTA 
GGTTTTACAA 
TGATAGACTC 
TAATCCAACT 
TGTTCACATT 
AATCCTCQTA 



ACCTGACCAT 
TGAGGAAACC 
TTGCCTCCAC 
GCTTTTTGGA 
TTTGTTTCAA 
AAGCTGCTAO 
GAATTATTCC 
ATCTTGCIG6 



TGCCACTOTG 
CTTAQTTTTT 
ATGCAGTGGC 
TGCCTCAGCC 



GTGATGTGAC 
ATAOGATATT 
ACATTATTAT 
QTTTTTTGAG 
CTCACTGCAA 



CTTACCTTTA TAAGAGCCAC 




TAACTTAA60 CTQXACTTTA 
ATGQAGTCTC ACTCTGTCGC 
CCTCTGCCTC CTGAOTTCAA 
AGGCACCTGC CACCACGCCC 
C»TGTTGGCC AGGCTGGTCT 
AAGTGCTGGG ATTAGGTGTG 
TAAATATGCT TCTTGAATAA 



AACATTTTTG 900 

GAGATGCCAA 960 

CTACTCTAAC 1020 

CTGTTGCAGC 1080 

AGGGTACTQA 1140 

CAOAATTTTC 1200 

ATCAAGGACT 1320 

ATATACCAAG 1380 

TAACTTATTT 1440 

TTGCATTGAC 1500 

TCATGAAACC 1560 

AATGTTAAAG 1620 

GGTATCTTTG 1680 

TTGATGGAGT 1740 

AGTCTTGCTC 1800 

TTAAGGCTTC 1860 

CCAGGCTGQA 1920 

ATGATTCTCC 1980 

AGCTAATTTT 2040 

TGAACTCCTG 2100 

AGCCACCGCA 2160 
TACACATTTT 



GATTTTTGTT f 



GAATTATGTC TTACAAACTA AAAGCAAAAA 
TCTAGCACAA TTTGAATTTT TAATTATCAA 
ATTTTAQTAC ATTTGTAAAT 



21 



31 



41 



51 



RSLCGIIPGL 
GTMYPMSVYS 
VRIRRDANEQ 
MPUiKGTDDL 



ATGCCSCCGC 
CCGOGGCXiGC 
GCCGTGGGCA 
CGGCCCACTG 



11 21 31 

I I I 

GGGAGCTGA6 CGAGGCOGAO CCGCCCCCGC 
GTAGCGGGCC CCCAGAGCTO GGCATCa^GT 
AGAGCAQCCT CATOGTC3VGC TACACCTGCA 
CGCTGGACAC CTTCTCTGGT AOGTACGTTC 
GGGCTGTGCA COGGGGAQCT GGGGCGGGCX3 
GAGaAGACTG GAGCAOGCCC CGAGGTGGCG 
CAGQCTCTCC COBCCCCGCC CCTGCAQTGC 
TTGAGCTCTG G8ACACAGCQ 



TCCGGGCCCC 
GCGTGCTGGT 
ATGGGTACXX: 
AATCGCCCGT 
TCTOGGCGGG 
CTGGTGCGGC 
AA6TCCTGGT 
TGACCQ 



2280 
2340 
2400 



I I I 

QRSFFGKLLR EFRLVAADRR SWKILLFGVl NLICTGPLLM W 
LFSLMTCLIS YWVTbSKPSP VYSFGFBRLE VIAVFASTVL AQLGALFII.K 
EIHTGRLLVG TPVAIiCFNIiP TMIiSIRNKPF AXVSKAASTS WIiQEHVADLS 
SSIFIjPRMHP FVLIDLAGAF ALCITYMLIE INNYFAVDTA SAIAIAIMTF 
GKVUiQTTPP HVIGQLDKLI REVSTLDGVXi EVRNEHFWTL GPGSLAGSVH 
MVLAHVTNRL YTLVSTIjTVQ IFKDDWIRPA U.S6PVAANV 1 
NPVTSTPAKP SSPPFEFSFN TFGKimiPVI LUITTQTRPYG E 
PGIGATQGLR TGFTNIPSRY GTNMRIGQPR P 

Seq ID NO: 484 DNA sequence 

Nucleic Acid Accession #: FGENESH predicted 
Coding sequence: 1..900 



GACCCCTCCC 



CGCXjOjCTAC 
GCGGCCGOGT 
AGGGCGCAOA 



CTTCCEAACT 
CXXW TGCGCA 
CTTTGCTACC 

QTGCTGCTGG TOGGCACCCA GGC06ACCTG AGGGAGOATG TCAACSTACT AATTCAGCTG 660 



CGQATACCQA TGTCTTCCTG GCGTGCTTCA GCGTGGTGCA QCCGhGCTCC 




GACCAGGGGG GCCGGGAGGG CCCOGTGCCC CAACCCXAGG CTCAGGQTCT GGCCOAGAAG 
A.TCCQAGCCT GCTGCTACCT TGAGTGCTCA GOCTTQACGC AQAAGAACTT GAAGGAAGTA 
TTTGACTCGG CTATTCTCAG TGCCATTGAG CACAAAGCCC QGCTGGAGAA GAAACTGAAT 
GCCAAAGGTQ TQCGCACCCT CTCCCGCTGC CGCTGGAAQA AGTTCTTCTG CtTCaTTTQA 






I I I I I I 

HPPRELSEAE PPPtiRAPTPF PRRRSAPPEL 6IKCVI.VGD6 AVGKSSLIVS YTCNGYPARy 
RPTALDTFSG TYVQSPVRPR GCXXSAVHRGA GAGVSAGGRR GPRGGDMSRP RGGAGAAQDA 
IiPNSGSPRPA PAVQVLVDGA PVRIELVTOTA GQEDFDRLRS LCYTOTDVFI. ACPSWQPSS 
EQNITEKWLP EIRTHKPQAP VIiLVGTQADL RDDVNVblQL DQGGRBGPVP QPQAQGLAEK 
IRACCYLECS ALTQKNLKEV FDSAII.SAIE HKARLEKKLN AKGVRTLSRC RWXKFFCFV 

Seq ID NO: 486 DNA sequence 

Nucleic Acid Accession #: XM_063832.2 

Coding sequence: 1..711 



GCCGTGGGCA AGAGCAGCCT 
CXX3CCCACTG CGCTGGACAC 
ATTGAGCTCT GGGACACAGC 
CaSGATACCG ATGTCTTCCT 
ATCACAOAGA AATGOCTQCC 



21 
I 

CSAGGCCGAG 
CCC3VGAGCTG 
CATCGTCAGC 
CTTCTCTGTG 
GGGACAGGAG 
GGCGTGCTTC 
CX3AGATC0GC 
GAGGGAOQAT 
CX»ACCCCAG 
AGCCTTGACG 
GCRCAAAGCC 
CCGCTGGAAG 



31 

1 

CCX3CCCCCGC 
CGCaVTCAAGT 
TACACCTGCA 
CAAGTCCTGG 
GATTTTGACC 
AGCCTGGTGC 
AOSCACAACC 
QTCAAOSTAC 
GCTCAGGGTC 
CAGAAGAACT 



AAGTTCTTCT 



41 

I 

TCCGGGCCCC 
GOrrGCTGGT 
ATGGGTflCCC 
TGGATGGAGC 
GACTTCGTTC 
AGCCCAGCTC 
CCCAGGCGCC 
TAATTCAGCT 
TGGCOGAGAA 
T6AAGGAAGT 
AGAA ACTGA A 
GCTTOGTTTG 



GACCccTccc eo 

GGGCGACGQC 120 

C6CGCGCTAC 180 

TCCGGTGCGC 240 

CXITTTGCTAC 300 

CTTTCAAAAC 360 

TGTGCTGCTG 420 

GGACCAGGGG 480 

6ATCCQAGCC 540 

ATTTGACTCG 600 



1 11 21 31 41 51 

I I I I I I 

MPPREMBAB PPPLRAPTPP PRRRSAPPEL GIKCVLVGDG AVGKSSLIVS YTafGYPARY 
RPTALOTPSV QVLVDOAPVR lELWDTAGQE DFDRLRSLCV PDTDVFIACP SWQPSSPQM 
ITEKWIjPBIR TmiPQAPVU. VGTQADLRDD VHVblQUJQG GREGPVFQPQ AQGLAEKIRA 
CCYLECSAIiT QKNLKEVPDS AIIiSAIEHKA RLEKKLHAKG VRTIiSRCRHR KFFCFV 

Seq ID NO: 488 DNA sequence 

Nucleic Acid Accession #: NM_014398.1 

Coding sequence: 64..1314 



41 



51 



GGCACCGATT 
ACCATGCCCC 
CACGATGGCA 



CGGOQCCTGC CCQGACTTCS CCGCACGCTG 
GGCAGCTCAG 



CCTCACCAAA 
ACAGTAAAAA 
ATTACCTACA 
GTTACTGAAG 
CCACCAGCTC 
ACTCAACOCA 
ACAAOCG8TC 
AATACCACCC 
CCATCGTCAO 



CAACAGTACA 
CTTTAGCAGC 
TTCCAACAAC 



TTACAGTCGG 
ATACASCTGO 
GTAACCAOAC 



TCAAGACTQG 
GGATACAGCT 
TCGACCCCAA 
TGAATTTTCA 



GGACATAAAA 
AAGATTCATG 
TACCCCAGCA 
AACCCAGGCC 
CCCTAGCTTA 
AACCAOTTCA 
CACCCTTCCA 
TCAACCCACC 
ACCTGCCTCC 
AATTTATCAG 
GATTGTTCAA 
CGCAACGCAA 
GGGCGGATTT 
GGGAGCCTAT 



TTTCCAGAAA 
AAACCTGTCC 
GATGGTCATA 
ACTACAAAAA 
ACACCCAACA 



I I 
CAGAACCTG6 CCCAO CGCCC 
OOTCCCTQGC CQTAATTTTB 
CCAGAGATTA TTCTCAACCT 
AGCAACCAGC TAAGCAAGCA 
TCACCTTTCA AACAGCGGCC 



TCAACCGTCA 
GCAACTTTAT 
CATGCCCCAO 
AOGOTTCCTO 
GTTCTAAACG 
GACAAGGAGT 



ACTCACACAC 
CACTGCCACC 
GCCACACAAC 
CGATAGCACT 



CTTCAAGCCT 
TACACAATTG 
GGTGTCTATA 



GCTTTTACTA 



CAGGCTGGAG 
TGATTCTCCT 
GCTAATTTTT 
CTCTTGACCT 
AGCC3VTTGCG 



AAATCCGCCT 
ATGAAAATAA 
CCCTCAGAGT 
GTCATGTQTG 
TGAAAGATAT 
GTCAACATTT 
TTATAAACXA 
ACCCTTOATC 
T C TGTOTTTT 
CACTTTAAAT 
TACAGTGGCA 
GCTTCAGCTT 
GTATTTTTAT 
CAGGTGATCC 



GATTGGGGCC 
AAGGTGTCAA 
TGGAATTTAG 



ATTTAAGTTC 
AGTGAGCTGT 
GAGATATGTT 



ATGGTTTCAT 
TTGTTTTTGT 
CGATCTOGGC 
CCCGAGTAGC 
TATAGAOQGG 
ACGCACCXCA 



GTGAATCTCA 
TTGACCGTCT 
CAGACAGCA6 
CACCIGCAGG 
TTTGOAAATQ 
ATCGTGGTTG 
TCATCTGGAT 
AGAACTCTTT 
CAAACRATGT 
AGG CAGC ACA 
TTATTTTCTA 
GAATTAACAT 
GTAAC TAATA 
CTTTGCTTTO 
GTAACATACA 
TTTTTQAGAC 
TTATGOCAAC 
TGGGATTACA 
TTTCACCATG 
GCCTCCCAAA 
TTAATCATCA 



GGCCCACCCT 
GAAGCAGACT 
CGGTTTTTTC 
ACTGTGGCAC 
CATTTACCAA 
CAGATCCAGA 
TCGGGCATTC 



TGGATGAGTG 
GTCTCTGCCT 
ACCAGRGAAT 
CATCCCTTCC 
AAACCACCAT 
TCAATTTCTA 
aXTTCCTTTA 
AATATATGTA 
CTACTGTGTG 
TTATCAAATG 
TATTCCTGGT 
GGAGTTTCAC 
CTCCGCCTCC 
GGCACACACT 
TTGGCCAGAC 
GTGCTGGGAT 
AAAAGAACAA 



AGCTCCTCCA 420 
CACCATCACC 480 
TGGGAACAOC S40 
GCACAAAAGC 600 
AGCTGCCCAC 660 
TGCACCTCAG 720 
CTGTATAAAA 
ACCTOGGAGA 
CCGAAAATCC 
GGATGAAGAA 
GACAGTTTAC 
CTTCAAGTGC 
CGATGTCCAA 
CrCGTCTGAC 
TATGGGTATG 
CTAATTGTTO 
AGGATGQATO 
CTTCTATTCA 
AATACTTTTT 
GAATATTTTA 
AAGTAGAATA 
TGCATTGAAG 
GACTTTCAGT 
GTAGCACTTA 
TCTTGTCACC 
CGGOTTCAAa 
ACCAOGCCTG 
TGCrCTTGAA 
TACAGGCATQ 
CATAXCTCAG 



1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980
2040 
2100 
2160 



GTTGTCTAAG TGTTTTTATG TAAAA.CCAAC AAAAAQAACA AATCAGCTTA TATTTTTTAT 2220 

CTTGATGACT CCTGCTCCAG AATTGCTAGA CTAAGAATTA GGTGGCTACA GATGGTAGAA 2280 

CTAAACAATA AGCAAGAGAC AATARTAATG GCCCTTAATT ATTAACAAAG TGCCAGAGTC 2340 

TAGGCTAAGC ACTTTATCTA TATCTCATTT CATTCTCACA ACTTATAAGT GAATOAGTAA 2400 

ACTOAOACTT AA6GGAACTG AATCACTTAA ATGTCACCTG GCTAACTGAT GGCAGAGCCA 2460 

GAGCTTOAAT •TCATGTTGGT CIOACATCAA GGTCTTTGGT CTTCTCXXTTA CACCAAOTTA 252 0 

CX:TACAAGAA CAATGACACC ACACTCTGCC TGAAGGCTCA CACCTCATAC CAGCATACGC 2S80 

TCACCTTACR GGGAAATGGG TTTATCCSWK ATCATQAGAC ATTAGGOTAG ATOAAAGGAG 2640 

AGCTTTGCAG ATAACAAAAT AGCCTATCCT TAATAAATCC TCCACTCTCT GGAAGGAGAC 2700 

TGAGGGGCTT TGTAAAACAT TAGTCAGTTG CTCATTTTTA TGGGATTGCr TAGCTGOGCT 2760 

GTAAAGATGA AGGCATCAAA TAAACTCAAA GTATTTTTAA ATTTTTTTGA TAATAGAGAA 2820 

ACTTOGCTAA CCAACTGTTC TTTCTTGAGT GTATAGCCCC ATCTTGTGGT AACTTGCTGC 2880 

TTCTGCACTT CATATCCATA TTTCCTATTG TTCACTTTAT TCTGTAGAGC AGCCTGCCAA 2940 

GAATTTTATT TCTGCTGTTT TTTTTGCTGC TAAAGAAAGG AACTAAGTCA GGATGTTAAC 3000 

AGAAAAGTCC ACATAACCCT AGAATTCTTA GTCAAGGAAT AATTCAAGTC AGCCTAGAGA 3060 

CCATGTTGAC TTTCCTCATQ TOTTTCCTTA TGACTCAGTA A6TTGGCAAG GT CCTOA CTT 3120 
TAGTCTTAAT AAAACATTGA ATTGTAGTAA AGGTTTTTaC AATAAAAACT TACTTTGG 



1 11 21 31 41 51 

i I I I I I 

MVRQIiSAAAA LPASIAVILH DOSOMRAKAF PETRDYSQPT AAATVQDIKK PVQQPAKQAP 60 

HQTLAARFMD 6HITFQTAAT VKIPTTTPAT TKNTATTSPI TYTLVTTQAT PNNSHTAPPV 120 

TEVTVQPSLA PYSLPPTITP PAHTAGTSSS TVSHTTGNTT QPSNQTTLPA TLSIAUUCST 180 

TGQKPDQPTH APGTTAAAHK TTRTAAPAST VPGPTIAPQP SSVKTGIYQV LNGSRIjClKA 240 

EMGIQUVQD KESVFSPHRY FNIDPNATQA -SGNCGTRKSN LUJJPQGGFV NLTFTKDEES 300 

YYISEVGAYL TVSDPETVYQ GIKHAWMFQ TAVGHSFKCV SEQSLQLSAH IflVKTTDVQI. 360 
QAFDFEDDHF GNVDECSSDY TIVLPVIQAI WGLCLMGMG VYKIRLRCQS SGYQRI 

Seq ID NO: 490 DNA sequence 

Nucleic Acid Accession #: NM_005409.3 

Coding sequence: 94..378 

1 11 21 31 41 51 

TTCCTTTCAT GTTCAGCATT TCTACTCCTT CCAAGAAGAG CAGCAAAGCT GAAOTAGCAG 60 

CAACAGCACC AGCAGCAACA GCAAAAAACA AACATGAGTG TGAAGGGCAT GGCTATAGCC 120 

TTGGCTGTGA TATTGTGTGC TACAGTTGTT CAAGGCTTCC CCATGTTCAA AAGAGGACGC 180 

TGTCTTTGCA TAGGCCCTGG GGTAAAAGCA GTGAAAGTGG CAGATATTGA GAAAGCCTCC 240 

ATAATGTACC CAAGTAACAA CTGTGACAAA ATAGAAGTGA TTATTACCCT GAAAGAAAAT 300 

AAAGGACAAC GATGCCTAAA TCCCAAATCG AAGCAAGCAA GGCTTATAAT CAAAAAAGTT 360 

GAAAGAAAGA ATTTTTAAAA ATATCAAAAC ATATGAAGTC CTGGAAAAGG GCATCTGAAA 420 

AACCTAOAAC AAGTTTAACT GTGACTACTG AAATGACAAG AATTCTACAG TAGGAAACTG 480 

AOACTTTTCT ATOaTTTTOT QACTTTCAAC TTTTGTACAG TTATGTGAAG GATGAAAGGT S40 

GGGTGRAAGG ACCRAAARCA GAAATACA6T CTTCCTOAAT GAATQACAAT CAOAATTCCA 600 

CTGCCCRAAG GAGTCCAGCA ATTAAATGGA TTTCTAGGAA AAGCTACCTT AAQAAAGGCT 660 

GGTTACCATC GGAGTTTACA AAGTGCTTTC ACGTTCTTAC TTGTTGTATT ATACATTCAT 720 

GCATTTCTAG GCTAGAGAAC CTTCIftfiATT TGATGCTTAC A ACTAT TCTG TTGTGACTAT 780 

GAGAACATTT CTGTCTCTAO AAGTTATCTQ TCTGTATTGA TCTTTATGCT ATATTACTAT 840 

CTGTGGTTAC AGTOGAGACA TTQACaTTAT TACTGGAGTC AAGCCCTTAT AAGT CAAA AG 900 

CATCTATGTG TCGTAAAGCA TTCCTCAAAC ATTTTTTCAT GCAAATACAC ACTTCTTTCC 960 

CCAAATATCA TGTAGCACAT CAATATGTAO GGAAACATTC TTATGCATCA TTTGGTTTGT 1020 

TTTATAACCA ATTCATTAAA TQTAATTCAT AAAATGTACT ATGAAAAAAA TTATACGCTA 1080 

TGGGATACTG GCAACAGTGC ACATATTTCA TAACCAAATT AGCAGCACOG OTCTTAATTT 1140 

GATGTTTTTC AACTTTTATT CATTGA6ATG TTTTGAAGCA ATTAGGATAT GTGTGTTTAC 1200 

TGTACrrrM QTTTTGATCC OrrraTATAA ATOATAGCAA TATCTTGGAC ACATTTGAAA 1260 

TACAAAATOT TTTTOTCTAC CAAAGAAAAA TtJTFGAAAAA TAAGCAAATO TATACCTAOC 1320 

AATCACTTTT ACTTTTTGTA ATTCTGTCTC TTAGAAAAAT ACATA ATCIA ATCaATTTCT 1380 

TTGTTCATGC CTATATACTG TAAAATTTAG GTATA CTCAA GACTAOTTTA AAGAATCAAA 1440 
GTCATTTTTT TCTCTAATAA ACTACCACAA CCTTTCTTTT TTAAAAAAAA AAA 



I I I I I I 

MSVKGMAIAL AVILCATVVQ GFPMFKRGRC LCIGPGVKAV KVADIEKASI MYPSNNCDKI 
EVIITLKENK GQRCLNPKSK QARLIIKKVE RKNF 
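
As an illustrative check that is not part of the original listing, the relationship between the coding range stated in a sequence header and the protein sequence that follows it can be sketched in Python. The sketch translates nucleotides 94..378 of Seq ID NO: 490 with the standard genetic code. The first 420 nucleotides are transcribed from the listing, with characters damaged in reproduction restored to the most plausible base; those restorations, the function name translate_cds, and the variable names are assumptions made for illustration only.

```python
# Standard genetic code, laid out in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_cds(sequence, start, end):
    """Translate the 1-based, inclusive range start..end of a DNA string.

    A trailing stop codon, if present, is dropped from the result.
    (Hypothetical helper written for this illustration.)
    """
    cds = sequence[start - 1:end].upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding range length is not a multiple of three")
    protein = "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
    return protein[:-1] if protein.endswith("*") else protein


# Nucleotides 1..420 of Seq ID NO: 490 as read from the listing.
# Assumption: illegible characters were restored to the most plausible base.
SEQ_490_PREFIX = "".join("""
TTCCTTTCAT GTTCAGCATT TCTACTCCTT CCAAGAAGAG CAGCAAAGCT GAAGTAGCAG
CAACAGCACC AGCAGCAACA GCAAAAAACA AACATGAGTG TGAAGGGCAT GGCTATAGCC
TTGGCTGTGA TATTGTGTGC TACAGTTGTT CAAGGCTTCC CCATGTTCAA AAGAGGACGC
TGTCTTTGCA TAGGCCCTGG GGTAAAAGCA GTGAAAGTGG CAGATATTGA GAAAGCCTCC
ATAATGTACC CAAGTAACAA CTGTGACAAA ATAGAAGTGA TTATTACCCT GAAAGAAAAT
AAAGGACAAC GATGCCTAAA TCCCAAATCG AAGCAAGCAA GGCTTATAAT CAAAAAAGTT
GAAAGAAAGA ATTTTTAAAA ATATCAAAAC ATATGAAGTC CTGGAAAAGG GCATCTGAAA
""".split())

protein = translate_cds(SEQ_490_PREFIX, 94, 378)
print(protein)
print(len(protein))
```

The range 94..378 spans 285 nucleotides, that is, 95 codons including the stop codon, so the translation is expected to contain 94 residues. The same helper could be applied to any other entry of the listing once its nucleotide sequence has been read reliably.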

Seq ID NO: 492 DNA sequence 

Nucleic Acid Accession #: NM_000577.1 

Coding sequence: 41..520 

1 11 21 31 41 51 

I I t { I I 

GGCACGAGGG GAAGACCTCC TQTCCTATCA GGCCCTCCCC ATGGCTTTAG AGAOOATCTG 

COGACXXrrCT gggagaaaat ccagcaagat gcaagccttc agaatctggg atgttaacca 

GAAGACCTTC TATCTGAGGA ACAACCAACT AGTTGCCGGA TACTTC3CRAG GACCAAATGT 
CAATTTAQAA GAAAAGATAQ ATGTGGTACC CATTGAGCCT CATGCTCTGT TCTTGGGAAT 
CCATGGAGGG AAGATGTGCC TGTCCTGTGT CAAGTCTGGT GATGAGACCA OACTCCAGCT 
GGAGGCAGTT AACATCACT6 ACCTGAGCGA GAACAGAAAG CAGGACAAGC GCTTCGCCTT 
CATCOQCTCA QACaunGaCC CCACCACCAG TTTTGAGTCT GCCGCCTGCC CCGGTTGGTT 
CCTCTCCRCA QCGUVIGQAAS CTOACCAGCC CGTCAGCCTC ACCAATATQC CTGACGAAGG 
C8ICATGGTC ACCAAATTCT ACTTCCAGGA GGACGAGTAG TACTGCCTAG GCCTGCCTGT 
TCtXaTTCTT OCATOGCAAa OACTaCAGGG ACTGCCAGTC CCCX:tGCCCC AGGGCTCCCG 




GCTATGGGQG CaCTGAOOAC 
CCTGGTCACA GGRCTCTGOC 
GTCTTTCTAA TGTGTGAATC 
TCTGCRTTCA GGATCAAACC 

CTTCxrrccxrr caticcacci 

ACCAAGTGGC TCCCACy^CCC 
TTTAAGGGTT TGTGGAAAAT 
GAAGGAGAGC CCTTCATTTG 
ATTCCTGCAT TTGTGAAATQ 
TTTTTTTCTG AT GTCCC AAC 
ATTTTTTTTT TCCTTTTAAA 
CCCAGCCTCC AAGCTCCATC 
TCCTATCAGA AGTTTCTCAG 
TCTTCCTCTG CTGAAGGAAT 
ACTTGTATQA AAGATGGCTG 
GAGCAGGAAA CATGACTCGT 
CTCTTGGCAG GTACTCAGCG 
CTGTGACTTC AGCTCTGTTT 






ccGACCACxrr 

TCCCATGCCC 
TGTTTTACAA 
GAAAATTAGG 
GAGATTATQT 
ATGGTGAAAG 
TTGTAAAAAT 
ACACTTCCAT 
TCCaCTCCAG 
CTCCCAAGGC 
AAATTGCTCC 
TGCCTCTGCX: 



CTGRCCAGCC 
AGCCXXTTGCA 
GCCCAACCTG 



TCCATGCICC 
CAAAGCCCTT 
CTCTCCTCTT 
AGGCCACTTG 
ACCAGTCCAT 



TAAGTGGTAO 
TAAAAGTTAT 
AATCTGGACT 
ATTTTTTACA 
TCTGAGCAAA 
TTGACATTGT 



A GAGG CTGAG 
CTTTTCCCTT 
GGTACTATOT 
CCTCTGTCCA 



GTCACAACAA 
CTCCAGAATG 
CCATGTCGCC 
GCCACTGCCT 
ATGACCCCCA 
GAGGGAGGTT 
CAGTCCCCGT 



GGTCCCTGCA 
GTATATGrre 
TCTTGAAAAT 
AAAAAA 



CTTTTTCTTC 
TAGCCCCATA 
GGC ACTGC TG 
GTACTTTACC 
QGGGQTTCTT 
GCACTTGGAG 
GAGCTCTGCA 
CCTAGCCTOS 



1080 
1140 
1200 
1260 
1320 
1380 
1440 



TGTGGCTCCT C 
AGAGCTTCTG G 
ACCAGGCTGG G 
GGGCC3UM3CA C 
GGTGC3UVAGT 1 
GCCTAAAAAA AAAAAAAAAA 1680 



r TCCCTACTTC 1620 






| | | | | |

MAIiETICRPS GRKSSKMQAF RIWDVNQKTF YLKNNQLVAG YLQGPMVNIjE F" 
HALFLGIHGO KMCLSCVKSG DETRLQLEAV NITDLSENRK QDKRFAPIES DSGPTTSFES 
AACPGWFLCT AMEADQPVSL TNMPDEGVMV TKFYFQBDE 

Seq ID NO: 494 DNA sequence

Nucleic Acid Accession #: NM_002081.1

Coding sequence: 222..1898



CGGGACCTTG 
AGAGGCGC6G 
GCTGGTGGCT 



GCGAGCGTTC 
GTCTCCGCCT 
GCTCTGCCCT 
GCX3GGTGGCC 



GGACCTCGCA 
CCTOSGCCGC 
TCGCGGGCGG 



I 



I 



GCGAGGT6CC 
CXTTGCTGCAC 
CCGOSC 



GCCX3CAGCX3C 
QAGGTCOGCC 
ATCTOGGGTG 



CGCCGCCTCT GGACCGCGAG CCGCGCGC6C 
GAACTGCGCA GGACCXS3GCC AGGATCCGAG 
CCGGCCCCGC CATGGAGCTC OGGGCCCXlAa 
TGGTOSCCTG CGCCXXSCGGG GACCCSGCCA 
AGATCTACGG AGCCAAGGGC TTCAGCCTGA 
GATCTGTCCC CaMJGGCTACA 



CAGCGAGATO QAGQAGAACC TGGCCAACOO CA6CCAT6CC GAGCTGOAOA 480 



TCGATGACCA CTTCCAGCAC 
CCGGCQCCTT CGGAGAQCTa 
AGCTGCGCCT GTACTACCGC 
GGGCCCGCCr GCTCGAGCGC 
ACTACCTGGA CTGCCTGGGC 
GAGAGCTGCG CCTGCGGGCC 
TGGGCGTGGC CAGCGACGTG 
CGAGAGCTGT CaiGAAGCTG 
CTATTGCCQA 
GAGGAACCTC 
GGAGAGTGTC 
CAACAGGGAC 
CCAGGGCCCT 
ACCTTCAGGC 
GGACTTCTGG 
CAGTOATGAC 



CTGCTGAACG 
TACXCQCAGA 
GGTGCCAACC 
CTCTTCAAGC 



CSWGCT TOCCACCCAG CraCGCAGCT 540 



ACTCGGAGCG 6ACGCTGCA6 GCCACXITTCC 
ACGCGAGGGC CTTCCGGGAC CTGTACTCAG 
TGCACCTGGA GGAGACX3CTG GCCGAGTTCT 
AGCTGCACCC CCAGCTGCTG CTGCCTGATG 
AGGOGCTGCG GCCCTTCGGG GAGGCCCCGA 
TCOTGGCrGC TCGCTCCTTT GTGCAGGGCC 
TGGCTCAGOT CCCCCTGGGC CCGGAGTGCT 



AGGTCAACCC 



AATGTGCTCA 
CTGGACTCCA 
ATCGGCAGCG 
ACGCTCACGG 



AGGGCIOCCT TGCCAACCAG GCOGACCTGG 
TGGTGCTCAT CAC06ACAAO TTCTGGGGTA 



ACGCTGGAGA 
ATCAGCCTCC 
OGCTGCTGGA 
AGGTCATGGa TGACGQCCTO GCCAACOVOA 
CATGACCATC CGGCAGCAGA 
CTACAACGGC AACGACGTGG 
CGGTGATGGC TGTCTGGATQ 



AGAAGCGGCG 
AGCTGGTCTC 
CAGGGACACr 



GGCTGCCAGC TGCXXCCAGC 
TACAGTAGCC AGGCCCCGGT 
ACTGACmG CCAAAAATAC 
CAGGGAGGGC 
CAGGCCTGGC CTGGCCTGCC 



CAGC3TCAGCT 
TCOGGCTGCC 
CTACAGAGGA 
CG CCTCCTCC 
CCTCCAGAGA 



TAGCCCTCCC 
GGCCTCAAAG 
CACTGGGACT 
AGCCCCGCAC 
TGCATGATGC 



CAGGGCTCAG 
CCCCGCACTG 
GCACGGGGAC 
TGTCCTTGTT 
CACCTTGGAC 



OACTGTCCTC 
ACCCACCTGC 
TGGGCTCTGC 
GGTCCAGGGC 
TCCTC3VCCCA 
AGTGACCCTC 
CACACGGGAA 



CCCAQCTCCC 
CAACCCGCTG 
CCCAGCAQAG 

CCTCCCCTCA 
AGCX5TCTGCA 
CCACAGACCC 
GCTTCTGCTG 
CAATGTGGGC 
TGTTGGAGGA 
GATCAGGAAC 
GGCTGTCACC 
TGCCTAG6TC 
AAGGGCrm 
TGTTCGCTCC 
TCCXaTCACT 



TCAACAACCC 
TCATGCAGCT 
ACrrCCAGGA 
ACCTCTGCXSG 
CCCTCCCAGG 
CCCCX3ACCTT 
GGCGGTAACT 
AACACAGACG 
CGGCGGCTCT 
TTTCTGCCTT 
GCCATGTATT 
TGCACCGCCG 
GAGCKCACAQ 
CCCACCAGCC 
GGTGTCCGCC 
GCGCAGGCTG 
GGGTGACGCC 
TGCAGTGBGG 
GAGGAGGGGA 
TGCCCCTOGC 



TGAAGCCAAG 
GTQCAGTGAG 
CAGAGGCCGG 
CGAGGTGGAG 
GAAGATCATG 
OGCCAGTGAC 



GCCCAGCTCC 
AAGATGGCCC 
TACCTCCCXB 
GTGGACATCA 
ACKAACCGGC 



CCTGTCAGAS 
CCTCCTGCCC 
GCCCCAAGGC 
ATATTTAATT 



TTAATTTTGT 
TCAGGGACCT 
CAGAAGCAGC 
CQAGCCTOTG 
AGCCCTGGCC 
ATCCAGGGTC 



AGCAGGAAGA 
CAGGAAOGAC 
CTCCTCCTCT 
CCCAGGGACA 
CACCTCAGCC 
AGGCGCAGAG 



CAGGGGCACC 



1020 
1080
1140 



1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 



1980 
2040 
2100 



TGAGACAGCA 
GGCCCTCCAT 
AGCTGGGCCC 
ACACAGGGCT 



CAGGGtX:TCC CTGTTCACGS 



CCTTCCTCCC 
CACCCCCCAQ 
TGGCAGAGCC 
CCCCaCCTCC 
CCACTGCTGA 
GOGCAGATGA 
AAAGGCCCAG 
CACAGGGCRG 
CCAGGACCCG 
TOACACSUSGT 



2280 
2340 
2400 
2460 



2820 
2880 
2940 
3000 



CACIGAGGCC ATCAGQGCCX: TOaXXAQQC 3120 



CTGGACGGGC CCTCCTTCCC
TGTGGTGTTG GGAAGGGGTC 
TCCTQAACCG ACTGACCCTG 
CGCACAGTGG ACGGAGGTCC 
CTGACTTTAG ATGTTTTGGG 
CCCTGCCAGT GCC3VGGGTGG 
CAGCACTCCC GCI6CACACA 
TCTCTGGAAG GGGCAGCCCT 

ccTTCxrrccsi caackstoccc 

TCCTTCSTATG AATAAAAOGC 



TCCTQTGCCC 
CTGCSW3GGGG 
AGGAGGCCGC 



CAGCTGCCAG 
AGGAGGACTT 
TTAGTGCTGC 
GTCASGTCCC 
CCCAACACAG 



GCTGOGGACT CTGGCACAGT GATGCOGGGC 



GTGGCCCTGG GGAGGGGTGQ 3180 
GGAGGGTCTG G 
TTTGCTTTTC P 
CATGGCTTGT I 
GCAAGTCCAC C 






GGGGCAGCTG 
ATCACCGTCC 
TCTCTGGAAC 
CCCATAATAA 
GCCAGGACAG 



GA6TGGTCAC 
CCACOGCTCA 
TGQAKACCTA 



GTGTCAGCGG 6I6ACGTGTG 1 



3420 
3480
3540
3600 
3660 



I 

MELRARGWWI, 
ICPQGYTCCT 
TLQATPPQAP 
QUiLPDDYLD 
PU3PECSRAV 
TDKFWGTSGV 
RGKXAPRERP 
RGRYLPBVMG 



i 



I 



GEIiYTQWARA 
CLGKQAEALR 
MKLVYCAHCL 



LCAAAAI.VAC ABfflJPASKSR SCQEVRQIYO 
SHAELETALR DSSRVLQAML 
FROLYSELRIi YYRGAHIiHIiB 
PFGEAPRELR LRATRAFVAA 
GVPGARPCPD YCRNVliKGCL 
LAEAIMAU^ NRDTLTAKVI 
EAKAQI.RDVQ DFWISIiPGTL 
EVGVOITKFD MTIRQQIMQL 



Seq ID NO: 496 DNA sequence

Nucleic Acid Accession #: NM_001650.2

Coding sequence: 40..1011



SI 



AIOSFSLSDVP QABISGEiajl 
ATQI.RSFDDH FQHLLNDSER 
ETLAEFWARIi IiERIiFKQIiHF 
RSFVQGLGVA SDWRKVAQV 
ANQAOLDAEH RKLLDSMVLI 
QGGGHPKWNP QGPGPEEKRR 



KIHTNRUISA YNGNDVDFQD 



r GGGGAAG6CA TGAGTGACAG 



GGGGTCTG6A 
TTTGTTCTCC 
GTCGACATGG 
TTTGGCCATA 
AGGAAGATCA 
ATTGGAGCAG 
ACCATGGTTC 
TTTCAATTGG 



TTCTCATCTC 
TCAGCGGTGG 
GCATCGCCAA 
GAATCCTCTA 
ATGGAAATCT 
TGTTTACTAT 
TASCAATTOG 



ATCCACCATC 
CCTTTGCTTT 
CCACATCAAC 
GTCTGTCrrC 



TACOGCTGGT 
CTTTGCCAGC 
ATTTTCTGTT 



OTCACAGOGG 
AACTGGGGTG 
GGACTCAGCA 
CCTGCAGTGA 
TAOlTOSCAG 
CCTCCCAGTG 
CATGGTCTCC 
TGTGATTCCA 



GAAAACCATT 
TATGAGTATG 
AAAGCTGCCC 



GAGGAGAAGA 
CGCACTOAAA 
GAAACAGATT 
GTCTAAACAA 
TCCAAATCTA 
TCTAGTTACC 
CCTGACAflAA 
AGTCAATTCT 



GGATATATTG 
TCTTCTGTCC 
AGCAAACAAR 
ACCTGATTCT 
AGGGGAAAGA 
GCAOACAAGA 
TGTTATAAAT 
TAAATATTTC 



TTTCATTAAC 



TATTTGAATA 



AGATGTTGAA 
AGGAAGCTAC 
AAAACCTGGA 
CCRATCTGGA 

CTCxnrrAGAA 

TAGAAATGTG 
ATAATTTACA 
TATTTTTAAG 
AACCAATTTT 
CSTCTATCAG 
TTTATTCTAT 



TTTGGACCTG 
ATCATAGGAQ 
TTCAAACGTC 



aa: 

gaacagaaaa 

TTGCaVACCaT 
CTGTGGCCRT 
CCCAGTGCCT 
TGGTGGGAGG 
TGGTTGAGTT 
AACX3GACTGA 
ATTTATTTGC 
C3VGTTATCAT 
CTGTCCTCGC 
GTTTTAAAGA 



GTGGTGCATG 
GAGGTATTGT 
CTGTCXTCAQ 
CAGGTTTGTT 
AAGGAGGAAC 
ATGTTCrrAA 
AACCGTGTGT 
CTTATTCCTT 
TAAACTGAGT 



TGATTGACGT 
CTTCAGTATO 
ATTTCCTTCC 
GTTTCATGTC 
GGAAGAAACC 
GCAAATATAT 
CAAGATTTGG 
CTCTACTQGA 
TTAACAATOO 



51 
1 

■ ACCCACAGCA 
GGCTTTCAAA 
CATGCTTATT 
GCXniTTACCG 
GQTGCRflTGC 
GGTGTGCACC 
GGGGGCCATC 
CCTGGGAGTC 
GATAATCACA 
TGTCACTGGC 
AATCAATTAT 
GGGAAATTGG 
TGGTGGCCTT 
AOCCTTCAGC 
GAGTCAGGTA 
TGACOBGGGA 
ACTAGAAGAT 



ATATTACTCa 
TATTGTGAAT 
ACCTATTTTA 
TTAAGTCTTG 
ATATTGGTAT 



1020 
1080 
1140 
1200 
1260 
1320 
1380






MSDRPTARRW GKOGPLCTRB NIMVAFKGVW TQAFWKAVTA EFLAMLIFVL LSMSTINWG 
GTBKPLPVDM VLISI.CFGI.S lATMVQCFGH ISGGHIHPAV TVAMVCTRKI SIAKSVFYIA 
AQCLGAIIOA QILYLVTPPS WGGUSVTOV KGNLTAtaBSli IiVELIITFQL VFTIFASOJS 
KRTDVTGSIA LAIGPSVAIG HLPAniYTQA SMNPARSPGP AVIWtai WENH WIYHVGPIIQ 
AVIAGGLYHY VPCPDVEFKR RFKEAFSKAA QQTKaSVMBV BDNRSQVBTD DLIIiKFGWH 
VIDVDRGBSK KGKDQSGEVL SSV 

Seq ID NO: 498 DNA sequence
Nucleic Acid Accession #: *
.1744 



I 

CCCCCTTGTC 
TTGGTACCX3G 
GACGGTTACC 
TGCTTGCTTT 
CATATATGGC 
CTTTTTCAAT 
CTCATATCXSV 



11 
I 

ATTAATACAT 
ATTTATACCR 
AGAGGAGAAG 
TATGTTGCTO 
ACATATTTAA 
CATGGAGAGT 
TTTCTTGTTC 
GGAAGCrTGA 



AAATAATGGA 
GACTCAGTCC 
TAATTTTTAT 
GTGGCAGCCG 
GTACCCGTGT 
TTCAGATGTT 
TTOCACTCTG 



OQGGTACATT 
ACTTTGTTTT 
TTTGGTAATT 



OATATATGTA 
GTTTTGATGT 
ATTTGGGGTA 



AATTAOE 



31 
I 

CAATCTTTAC 
CTTGATTGGT 
TATTGAAAGC 
TTTAAATGGA 
ATTAGGAGGC 
AATGTGGACA 
GCTAGTGACT 
CATTTCCAAT 
OATTGCKrCA 



I 



ATTCAAACCA 
TGTGAASaAT 
CTAATGATGG 
CTGGTTACAG 
CCAOCTCTCC 
CA TATTC TCA 
GTATTTTTCA 



AQATATQTR3 
TGGGAGATCC 
CATTATTCTT 
TGTTGTGCTT 
GTQAAAGCTT 
GGGCTACAAA 
TGCTT CCTTG 
TATATGTTOT 



TGAACTTAGT TTATC3GGTTA
ATACTTCACA TCTAAAATTT 
ATCAAAATTC TTTAGTTATA 
TGACTTTATG QAAAAAGAQA 
TCTTGTAGTG TTTGTTGCTA 
TAAACAACAG ACACATGTAA 
ATTGCAATTG TTAGCATATA 
GACACCACAC ATGTGTGTTA 
CTTTTGCaiAA GTACATCCTG 
AQGTTCAGCA AATCTGCAAA 
AGAAGAACTT ATAGAATGGA 
CATGCCC3VCG ATGGCAAGTG 
TTATGAAGAC GCAGGCTTGA 
AGCAGCCX»A GAAGT6AAGC 
AOAGTCATGG TGTGTAAGAA 
AGAAGATCCT GCCAATGCTG 
ACCTCACTTC ACCACTGTAT 
ATGACTGCTA CATGACCTGC 
GCTAAGTCAT GTGTTGTTCA 
AACGTTTTAT TTGGTCAATT 
CAATTGGTTA CTATTTCAAT 
TACTTGACTT TAATATTTQT 
GGGTTGTGAC AAGGATGACA 
TATAGTGTAA GAACTATTAA 
TGGACATTTC AGTGATTGTA 
ACAQTAGATG GGGCAGACAT 
CTTGTGCGCT TTTTGT.TCTC 
CCAAACTCTT GGACCTTOTG 
GTGTCTCTTT CTCTCTCTCT 
AATAAGTACT GTTTACTCAT 
ATTAATGGTA ACTGTATTTT 
CITTAAACCA CTTTAAAGTT 
ACAACCCCAA ATATAGTGCA 
TGQAACAGTA QACTOTASTA 
TGACTAATTT TGCTQTCAAA 
CTTTTAAGAT TGCXriGTCTT 
TCTAAAAACC ATCATTTCAQ 
TTCTTCCTAT TTATTTTTAT 
ATTTTTGTTT CTTAGAAGTT 
ATAGCTCTGA GAAAAGGTTT 
AGACTACACA TTTAATTATA 
IGGATTCTCC TTCAGACCTT 
GCTGTTCCCT TTCTACCTTC 
CAflCTMCra ACATAASTCA 
CCCCXaGTCT TCTGCCAGCG 
■ CAGCCCAGGA GAGGGACTCT 
CTGGTTGACT GTTTTTATTT 
TTGATTTGTG GATTAAAAGC 
CGGTGGACAC TTGAGGCTGA 
TGCTTATCTG TGATTOTTGC 
ATTTTTAAAA ATTATACTTT 
TCCAAAQAAO TTCACATCTA 
CTATTATAAC TGCTTCATTT 
GAAGGATCXT TTTGTAGCAG 
TAGGGGAGCC AGTTTGGAGC 
TCGC3VGCCCA AGCACTGTGG 
TTCCATAGGC QTACAAAACA 
AATATTTTTQ CTTTAGTATG 
TTGCTACAAC ATTTTOSAAA 
AAAAATGCTT GGCAAAAAAA 
GAAAGTATGT ATCAGGAATA 
ATGCCAGTTG TTTACATGTA 






TTCAAGGATG TTTTTGGTTA TTTGGAACTG 
TTGG TATTG C 
AGGATTTTGA 
CTCCACTGAQ 
TTGTTAGAAA 



C3VGCCCTTGG 
TGGCATCACT 
GTGCTATTGT 
CCCAGTGGAA 
TCAAATATAG 
TTAAGCTCTC 



TATACCXGTG 
ACATTATTGC 
GATATGTGGG 
GGAGAGCTGG 
ATGAQACTAA 
AGACAGCTAT 
TTAGCAGCAA 
GAGTTCAGCA 
GATGCAGTGT 
CCCATTGTGA 



GAGAACTGAT 
GATCCAAGCC 
GGAAAACTCC 
TCCAGAACAG 
TGCCTACGGA 
TATCCCAAAA 
TGAATGTCAT 



GTGAACTATT 
ATGCCTGAAA 
CTCTTGGTGA 



GCTAAAGTGA 
TGAAAATACA 
TOCCCCTTGC 
AGTTCTTCGQ 
GGAGTGTTTG 
TGTTCTCTTQ 
GTTAGGAAAT 
CTTGCTCTCT 
TTAGTTGCTT 
TCTCATTTTT 
TTTTCATGTT 
TCTACSAAACT 



AGATGACGCT 
TACTTTATTG 
ATACACAAAG 
GATTATTAGT 
GTTTGATCAT 
TATTTTAATT 
GATCTGCrCA 
GTTTGCTATA 
TATTGTAGGG 
TACTAAACCA 
TGCACTTCGG 
AAAAATAGTA 
AAAGTTAAAA 
TGGTTGCAGT 
CTTATGTAAC 
TGTTTACAAA 
GAACTACATC TGTAATCGTT 
ACTTTTATAG GTAACTGTTT 
TCTAATTATA AAAATGACTT 
AAATTTGCTA TGCAAATOAG 
GCAAAGCTAC CTGTATAAAG 
GG ACAAT TCT GACAATGTAG 
TTCTTTTTTC 



TTTTCXrrGCC 
TtXAGCCATA 
CTTCCCCATA 
TATTCTTGGT 
AAATGTTTAT 
TTTAAGAAAA 
TGATTAGACA 



TCATACTTAA 
ACTTACTAAC 
CAGCGGAGTT 
TTCCAGTTGT 
GTGTCTTAGC 



AACTCTTCTT 
TTGGATGGCT 
TGTCAATACA 
ATTTQCCXrCA 
TTGCGGGTGC 
ATC3VTCCACA 
ATAGTCGGAA 
AC^TTCTAGA 
TTTGGGATGT 



TTGTAAAAGA 
TTAATGTTTT 
TCAAATAGAA 
ACACCTTTAT 
TATATGCTTG 
AAAACACAGT 



CTATATAAAA 
TTAATGAAGC 
TCCCTTAACT 
CTCTTCTCCT 
ATCAAGTACT 
AGCATTATTC 
TAATTATAGT 
AATGTATATT 



TCTTOTCTTT 
AACTTQAGTT 
GTTTTACTTC 
CATTATTAAT 

TGGCATTATC 
TTTTCTGTCA 
TTTAAAAAAA 
ATTTTAATAC 
ACTATTTTGA 
TCaTTTATAQ 
CAATAAAACA 



TAATAAQAC3V 
TATAAGGAAT 
TTTGAAAAAT 
AATTTGTGTG 
TAGGGTTTTA 
CAGCTTCTCT 



ATAATOATAA 
AAGCCTTAAG 
AAGTATATTT 
TTCTACACCT 
AAATGAGATT 
AATTCTAAGC 
TTCTTAACCA 
CATQGTTTCT 



CCTTATGTTA TAATTTTGGT 
CGTCCTCCTC TTTAGTTTTT 
TCTTTGAATT CCTTGTATGA 
ePTCAAAAOS ATGAAACCTC 
AAAOCGTGAC TATGGCTGAC 
CAGGCSU3ATT AACXTTCATTG 



CCTAGAGCOS 
GGCAGCGTTC 
CTTAGGCTTT 
AACATTTGAG 
GGATGGGAGT 
TCACCTGAOT 
TACATTTATT 



TTCRGATTTG 
TGCTAGTTTT 
CGATGATGCA 
TGACATGAGC 
GTGGCTGATT 



GCTGGGAACX: 
TGTTTATGAA 
AGAGGCCTGA 
AGCATCCACA 
GTATTAAAGC 
AGGAAAGTAA 
ACAAAGTTGG 
ATATAGTGTT 
AAGTGATATT 
CTATATATGT 



ATTCTQTATA 
ATTAAAASTA 
TGTAACCCCC 
AGGTCCCTGC 



GGATGCTCCT 
TGGCCACTGT 
AGAAAATAGG 
CAACAGTCCA 
AGGGAGAGGG 
GTGTACATCC 
CTCACCCCCA 



TTCTCATTTG 
GAAGCAGCCC 
GGAAAATQGG 



TCAGTGTTTT 
GGATGGGCAA 
GGCTGTATTT 
AAAATAGGCC 



ATATAAATTA 
AGCAAAATAT 
TATGCAGCCG 
CAATGCAGAT 



AGCAGTTACA 
GTAATTTCCT 
ATTAAAAATA 
GCTTTTTCCA 
GGCTATATAT 



TAAATTAAAA 



AGAAGOGATC 
CTTTAAAAAO 
AGTQATATTA 
ATIGIATTTT 
AAAATCATQA 



ATAAGCCTCT 
ATGAGAAAAT 
TATGAATTTT 



11 



ACPYVAVIFI 
SYPFLVI.QML 
GYIDICKUSK 
ELSLWVIQGC 
DFMEKETPLR 
LQUAYTALG 
GSANLQTQWN 
YEDAGIjRART 
EDPANAGKTP 



I 

NLYPBVILAS 
LNGLMMALFF 
LVTHILRATK 
IIYIHMISLA 
PHLFGTVILK 
YTKTI.LLPW 

IVGEFSNLPQ 
KIVYSMYSRK 
I.CH1jIjVKDSK 



21 
I 

WYRIYTKIMD 
lYGTYLSGSR 
LYRGSLIALC 
LCFVIMFOIS 
YLTSKIPGIA 
LWFVAIVRK 



TKPDAVFA<3A 
VYKVLEWKE 



2040 
2100 
2160 
2220 
2280 
2340 
2400 

2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 



3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 



31 41 51 

I I I 

blGIQTKICW TVTRGEGI 
LGGLVTVLCP FFHHGECTRV MWTPPLRBSF 
ISKVFFMLPW QFAQFVLLTQ IASI.FAVYW 
MliLTSYYASS LVHWGILAM KPHPLKINVS 
DDAHIGm.l.T fiKFPSYKDFD TLLYTCAABP 



Seq ID NO: 500 DNA sequence
Nucleic Acid Accession #: NM_001276.1
Coding sequence: 127..1278



| | | | | |

ASTGGAGTCG 6ACAG6TATA TAAAGOAAGT ACASGOCCTG GGGAAGASGC CXnGTCTAGG 
TAGCIGQCAC CAGGAGCCGT GGGCAAGGGA AGAGGCCACA CCCTGCCCTQ CTCTGCTGCR 




GCCAGAATOa GTQTGAAGGC 
TGCTCTGCAT ACAARCTGGT 
GGGAGCTGCT TCCXa«3ATGC 
GCCAATATAA GCAACGATCA 
ATGCTCAACA CACTCAAGAA 
TGGAACTTTQ GGTCTCAAAG 
TTCATCAAGT CRGTACCGCC 
TGGCTCTACC CTGGACGGAG 
GCCOAATTTA TAAAGGAAGC 
TCTGOGGGGA AGGTCACCAT 
GATTTCATTA GCATCATGAC 
CACAGTCCCC TGTTCCGAGG 
TATQCTGTGG GGTACATGTT 
CCCACCTTCX3 GGAQGAGCTT 
TCAGOACCGG GAATTCXAGO 
ATCTOTtlACT TCCTCCGCGQ 
GCCACCAAGG GCAACCAGTO 
CAGTACCTGA AGGATAGGCA 
TTCC».GQGCT CCTTCTQCGG 
GCACTCGCTG CAACGTAGCC 
CCCCCTCTGG CTCCAGCTGG 
GCCTCAGTCT CCCTCCCTTG 
GCCCTGGTGG GCAGAGAGGT 
OACrCGGQAT TAQTACACAC 
TGGCAAGGGA ATTTCTTCAA 
TGGCAAGCTC TATCACCAAG 
TACOCCCTGC AAAGCCAGCT 
ACTTCCCCTT CCTAATTCCR 
OQCTTTGGTT T6GTCTATCT 
TCTTCTGGQT TCCTTCCTCT 
ATGTT 






GTCTCAAACA GGCTTTGTGG 
CTGCTACTAC ACCAGCTGGT 
CCTTGACCGC TTCCTCTGTA 
CATCGACACC TGGGAGTGGA 
CAGGAACCCC AACCTGAAGA 



TCCTGGTGCT 
CCCAGTACCG 
CCCRCMCAT 
ATGATGTGAC 
CTCTCTTGTC 
ACACXXAGAQ 



GCTCCaGTGC 
GGAAGGCGAT 
CTACAGCTTT 
GCTCTACX3GC 



ATTCCTGCGC 
AGACAAACAG 
CCAGCCAGGG 
TGACAGCAGC 
CTACGATTTT 
TCAGGAGGAT 
GAGGCTGGGG 



CCGGTTCACC 



GCAAGTCCTG 
GCTCCTGCCA 
TCTTCTGAGA 



CCCTAATCRA 
TCCTGCTCAO 
CCAAGATATC 
GGCGTGGGAC 
ACAGATTCAG 
GTAAGCTGGT 
CTGGTGTTGG 
GGACCCTTGC 



AGCCACAOTC CATAGAACCC TOGOOCAOCA 



GGTAGGATAC 
GCTGGCAGGC 
CCAGOATCTG 



CCX3GGAGCCT 
GGGCCTATGC 
AGGGATQGGG 
TTGTTQATaA 



GCCATGGTAT 
CXSCTTCCCTC 
CACACAGCAC 
GATCACXTTGC 



AAAGCGTCAA 
GGGCCCIGGA 
TC3VCCAATGC 



TCGCGGGACT 
GGACCTTGCC 
GGAAATOAAG 
CGCaVGCACTG 
CCAACACCTG 
CACAGGCCAT 
CAACACTGAC 
GATGGGCATC 
AGCCCCAATC 
CTACTATGAG 
G6TCCCCTAT 



CCTGCTGAGT 
AACACACAGA 
AGTGAGGCaT 



CCTGGAT6AC 
CATCAAGGAT 
GATGCCCCQT 
CCCAGGCTGA 
TTTGAGCTCA 
CGCAATGTAA 



QAGCCRAACA TCCTACRAGA 
TGAAACCTTC ACn^OQAAC 
CAOCICCTCA ATAAAGTACA 



CTTATCKAAG GACACCATTT 
cacastgacc atactaatta 
GnCAATCSTOT, CCCCTATCCT 
agagtttaac aotgtgttgg 
CTOQACTCAC CTCCCCCATC 



1020
1080
1140
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 






11 



21 



31 



41 



51 



| | | | | |

MGVKASQTGF WLVLLQCCS AYKLVCTYTS WSQYREGDGS CFPDALDRFL CTHIIYSFAN 
ISNDHIDTWB VnroVTIjYGMIi NTUCNRNPNL KTLIjSVGGWN FGSQRFSKIA SNTQSBiRTPI 
KSVPPPUtTH GFOGLDLAWL YPGRROKQHF TTLIKETUCAE PIKBAQPGKK QLLLSAALSA 
GKVTIDSSYD lAKrSQHLDF ISIMTYDFHQ AWHGTTGHHS PLPRGQEDAS PDHFSNTDYA 
VGYMLRLGAP ASKI.VMGIPT FGR5FTLASS ET6VGAPISG PGIPGRFTKE AGTLAYYEIC 
OPLROATVHR TLGQQVPYAT KGNQWVGYDO QESVKSKVQY LXDRQLAGAM VWALDLDDFQ 
GSFCGQDI.RF FIiTNAIKDAIi AAT 

Seq ID NO: 502 DNA sequence

Nucleic Acid Accession #: NM_00S474.1 

Coding sequence: iai..6S9 



I 

GCTGCCTAGG 
TCCGGCCCCC 
TTCCCCCAGC 
ATGTGGAAGO 
GAAGGAQCCA 



GTCTGGAAAQ 
CCACCX3TCGC 
TCAGAATCTT 
TGTCAGCTCT 



CTCGGGCACC C 
GCTCCTCCAG G 
GCTGCTCGGC C 
GCTCTTCGTT T 



AAGTCTGGCT 
GAGGATCTGC 
GCCTCAAACQ 



CAGQTGCCGA 
TGACAACTCT 
CAACTTCAGA 
TGGCX31CCAG 
ATGGTTTGTC 
TCATTGGTGG 



TOVCCAGATT 



GTGGCCCTGT 
CAOATTCCAC 
TOGTTCTTAA 



AGATGATGTG 
GGTGGCAAOV 
AAGCACAGTC 
TCACTCCaCG 
AACAGTGACC 
AATCATCGTT 
GTTACQCCCT 
CCCTGAGCTC 



GACACTGAGA 
GTGACrCCAG 
AGTGTCAACA 



C GGGGCTCCTG 
3 TGGCCGCGGT 
fi GCAACAACTC 
3 CQTOGCTCTd 



CTCcxavcccc 

GCTTTTAATT 
AACGGGAACG 



GAGAAAGTGG 
CTGGTTGGAA 
GTGGTTATGC 
GCTTGCCAAC 



AACAAAGTCC 
ATGGAGACAC 
TCATAGTTGG 
GAAAAATGTC 
QTOCTTTAAA 
GATGACCCTG 



AAGOGCCACA 480 

ACAGACAACA 540 

GGTCTTACTA 600

GGGAAGGTAC 660 

AAAAGACOGT 720 
GGAACATTTG 



CGTTTGCCAA ATTAACCXJAG GAAAOACCTT 840 



| | | | | |

MWKVSAIiIiFV IiGSASLWVIiA BGASTGQPED DTETTGIiEGQ VAHFGABDDV " 
KSGLTTLVAT SVNSVTGIRI EDLPTSESTV HAQEQSPSAT 
VEKDGLSTVT LVGIIVGVLI. AIGFIGOIIV WMRKKS6RY 



Coding sequence: 62..895



11 



21 



31 



41 



| | | | | |

CACTGCTCTG AGAATTTGTG ASCAGCCCCX AACAGGCTGT TACTTCACTA CAACIOAOQA 
TATGATCATC TTAATTTACT TATTTCTCTT GCTATGCOAA GACACTCAAG GATOGGGUITT 
CAAGGATOGA ATTTTTCATA ACTCXZATATG OCTTOAACaA GCWXXISGTO TGTACCACRG
AGAAGCACGO TCTGGCAAAT ACAAGCTCAC CTACGCAGAA GCTAAGGCOG TGrOTQAATT 
TGAAGGCGGC CATCTOGCAA CTTACAAGCA GCTAGAGGCA OCCAOAAAAA TTOGATTTCA 



TGTCTGTGCT GCTGGATGGA
GCCCAACTGT GGATTTGGAA 
TGAAAGATGG GATGCCTATT 
AGATCCAAAG CAAATTTTTA 
CTGCTACTGG CACATTAGAC 
TGACCTTGAA QATGACCCAG 
TGATGTCCAT GGCTTTGTGG 
TACAOGAAAT GTCATGACCT 
CXyAATCAAA TATGTTGCAA 
TACTACTTCT ACTGGAAATA 
AAAAAAAGGA TGATC»AAAC 
CTCSVCTGTTA TTATTAACAT 
TAGGGAAAAT TGGAAAATAT 
ACTGCATAGA AATAACAAGC 
TTTGTGGTAT A TGTAT ATAT 
TCTATGTACA GTTTTGTATT 
TCATTQATTA TTCTACAAAA 
TQTTTTATGC ATTATTTAAG 
ATTGTTGCAA TAAATATCCT 



TGGCTAAGGG 
AAACTGGCAT 
GCTACAACCC 
AATCTCCAGG 



CAGAGTTGGA 



OAAOATACrG 
TGAAGTTTCT 
TGGATCCTGT 
AAAACTTTTT 
ACACAGTGTT 
TTATTTATTA 



ACAOGCAAAG 
CTTCCCAAAT 
TCAGCXSTATT 
TGATTATGTT 



GTTAACATTT 
GTACCTATAT 
ATACTTTTTA 



CCTOrCTCTA 



TATGTTGGAA 
TTTTTCTAAA 
AAACGAGAAA 

TCATATTrrr 

GTATTTGCAT 
AATCITGAAC 
AAACAGCTGT 



TOAAGCCAGG 360 
GGAATCCGTC TCAATAGGAG 420 
GAGTOTGGTG GCGTCTTTAC 480 
GAGTACGAAQ ATAACCAAAT 540 
CACCTGAGTT TTTTAGATTT 600 
GAAATATATQ ACA6TTAGBA 660
CTTCCAQATQ ACATCATCAS 720 
? TCAOTGACAQ CTGSAG6TTT 780 
AAAATACAAG 
TATAAAAAAA 
CTCCTTTGAT 
ATACATAATT 
TC ATAATC CC 
CATTTTTCTA 
G6AATCCIGC 
TTTTCTGAAA
TGTGAAAGCA
ATQAAACCTC 
TTCTTTCAGT 
TTGAAATTTT 
TTTATAAACA 



TOAACACACA AAAAAAAAAA AA 



kT TTCAGOTCAT TTTCATAAAT 



11 



21 



GAATTCGCAC 
CTGACGATAT 
GGGGATTCAA 



GTGAATTT6A 



ATAGGASIGA 
TCTTTACAGA 
ACCAAATCTG 
TAGATTTTGA 
GTTACGATGA 
TCATCAGTAC 
GAGGTTTCCA 
ATACAAOTAC 
AAAAAAAAAA 
TTGATCTCAC 
TAATTTAGGG 
ATCCCACTGC 
TTGTATTTGT 
.CCTGCTCTAT 
TGAAATCATT 
ATGAATGTTT 
TAAATATTQT 



11 
I 

TGCTCTGAGA 
GATCATCTTA 
GGATGGAATT 
AGCACXSOTCT 
AGGCGGCCAT 
CTGTGCTGCT 
CAACTGATGA 
AAGATGGGAT 
TCCAAAGOSA 
CTACTGGCAC 
CCTTGAAGAT 
TQTCCATGGC 
AGGAAATGTC 
AATCAAATAT 
TACTTCTACT 



21 
I 

ATTTGTGAGC 
ATTTACTTAT 

•rrrcATAACT 

GGCAAATACA 
CTCXjCAACTT 



AGCCCCTAAC 
TTCTCTTGCT 
CCATATGGCT 
AGCTCACCCA 



GGATGGATGG CTAAGGGCAG / 
TTTGGAAAAA 
GCCTATTGCT 
ATTTTTAAAT 



GACCCAGGTT 
TTTGTGGGAA 
ATGACCTTGA 



ACAACCCACA 
CTCCAGGCTT 
AGTATGGTCA 
GCTTGGCTGA 

AGTTTCTAAG 
ATCCTOTA TC 
ACTTTTTAGC 



CGCAAAGGAG 
CCCAAATGAG 
GOGTATTCAC 
TTATGTTGAA 



TGTTATTATT 
AAAATTGGAA 
ATAGAAATAA 
GGTATA TQTA 
OTACAGTTTT 
GATTATTCTA 
TATGCATTAT 
TGC31ATAAAT 



AACATTTATT 
AATATAGOAA 
C31AGCGTTAA 
TATATGTACC 
GTATTATACT 



TATTATTTTT 
ACTTTAAACG 
CATTTTCATA 
TATATGTATT 
TTTTAAATCT 
ATTTTAAACA 
CTCTATTBTT 



1200 
1260 
1320 
1380 



I I I I I 

MIILIYLFLL LWEDTQGWGF KDQIFHHSIW LERAAGVYHR BARSQKYKLT YAEAKAVCCP 
EGGBLATYKQ LBAARKIGFH VCAAGWMAKG RVGYPIVKPQ PNCGFGKip.I IDYGIULHRS . 
ERWOAYCYNF HAKECGGVFT DPKQIFKSPG FPHEYEDNQI CyHHIRUCYG QRIHLSFLOF 
DIiEDDPGCLA DYVEIYDSYD DVHGFVGRYC GOEZ.PDDII 
QIKYVAMDPV SKSSQGKNTS TTSTGMKNFL AORFSHL 

Seq ID NO: 506 DNA sequence
Nucleic Acid Accession #: NM_007115.1



AGGCT6TTAC TTCACTACAA 
ATGGGAAGAC ACTCAAGGAT 
TGAACGAGCA GCCGGTGTGT 
CQC3U3AAGCT AAOGCGOTOT 



ITTQIGA 360 



T6TGGTGGCG 
TACBAAGATA 
CTGAOTTTTT 
ATATATQACA 
CCAOATGACA 
GTGACAGCTG 
CAAGGAAAAA 
AQCCACTTAT 
TGGAACTCCT 
AAQAAATACA 
ACCTCTCATA 



TGATGCTTCA 
CAAATCCAGT 
TGGAAGATTr 
TGGAATCTTT 
CTAAATGTGA 
AGAAAATGAA 



TGCATTTGAA ATTTTGGAAT 
TOAACTTTAT OAACATTTTC 
GCTGTAAAAT ATTCTATOAT 
QOM-TTTCAa i 






MIILIYbFUi I.WEOTQGWGF KDGI] 
BGQHIATYKQ LBAARKIQPH VCAAOWMAXG RVGYFIVKFO PNXXFQKIGI IDXQIRLHRS 
ERHQAYCXNP HAKECQOVFT DPKRIFKSPG FPNBYEDKQI CXHHIRUnG QRIKLSFIiDF 
DIiBDDFGCbA DYVEIVDSYD DVHOFVGRYC GOBLPODIIS TGHVMIIjKFb SDASVTAGGF 

QiKWArmpv sKssooKirrs TTSvanoiFL agrfsbl 



Seq ID NO: 508 DNA sequence
Nucleic Acid Accession #: N
Coding sequence: 129..1991



GTGTGOCCAT OA6TAAGAGC AAAT0CTCC3Q 
CTAAGOAQCC CAATGCXX3TG GOCCCQAAOa 
ACGGASTGCA GCICACC3VaC TCCAOCCICA 



41 51 
I ! 
aSGCGCCAGG ACTCGCXSTGC 
ACAGAATTCC TCAACTCCCA 
GTCTTCOSTO G 
CRTCCTK
TCCTGGTCCC CTACCTGCTC
TGGCCCTCGG CCAGTTCAAC 
TGAAAGGTGT GGGCTTCACG 
TCATCATCGC CTGGGCGCTG 
TCCACTGCAA CAACTCCTCG 
GTGQAGACAG CTCGGGCCTC 
AAOGTGGCGT GCTGCACCTC 
GGCAGCTCAC AGCCTGCCTG 
GCOTGAAGAC CTCAGGGAAG 
CTGCCCTGCT CCTGCGTGGG 
TGAQCGTTGA CTTCTACCGG 
TGTQCTTCTC CCTGGGCOTG 
TCACCAACAA CTGCTACAGG 
TCTCCTCCGG CTTCGTCGTC 
CCATCGGGGA COTGGCCAAQ 
TCGCCACGCT CCCTCTGTCC 
TGGGTATCGA CAGCGCCATG 
TCCAGCTGCT GCACAGACAC 
TCCTGTCCCT GTTCTGCGTC 
TTGCAGCCGG CACGTCCATC 
TCTATGGTGT TGGGCAGTTC 
TGTACTGQCG GCTGTGCTGG 
TCAGCATTGT QACCTTCAGA 
ACGOGCTGGG CTGGGTCATC 
ACAAGTTCTG CAJSCCTGCCT 
AGAAGGACCG TGAGCTGGTG 
TCAAGGTGTA GAGGGAGCAG 
ACX3AACAAAC C3VAGGAARTC 
CTCTACTGAA AACACAAACA 
TCCGTGCCX3G QAGCGCACCT 
GCAGCGAGGT CCRCCCCGTT 
TCTGTGTGAG GCTCCCTCCC 



AGGGAAGGGG 
GTCATCCTCA 
CACTATCTCT 
AACAGCCXXA 
AACGACACTT 
CACCAGAGCC 
GTGCTGGTCA 
QTGGTATGGA 
GTCACCCTCC 
CTCTGCGAGG 
OGGTTCXjaGG 
GA06CGATTG 
TTCTCCTTCC 



TTGCTGGQAT 
CCX5CTGGTGT 
TCTCACTGTA 
TCTCCTCCTT 
ACTGCTCGGA 
TTGGGACCAC 
ATGGCATCGA 
TCGTGCTGCT 
TCACAQCCAC 
CTCGAGCCAT 
CGrCTGTTTG 
TGCTGATCGC 
TCACCACCTC 
TGGGGTACAT 
GGCTGATCTT 
CCGTGGTCTT 



GCCaCTTTTC TACATGGAGC 
CTGGAAGATC TGCCCCATAC 
TGTCGGCTTC TTCTACAACG 
CACCACGGAQ CTCCCCTGGA 






ACCTGCTGCC QM3TACTTTQ 780 
CGACCTGGG6 CCTCOGGOOT 840 



CTACTTCAGC 
CATGCCRTAC 
AGACX3GCATC 
GATTGACXXX3 
CTTCTCCAGC 
CATCAACTCC 
GGCACAGAAQ 
CATCRTCTAC 
CTTCATCATG CTGCTCACCC 



CTCTGGAAlOQ 
GTGGTCCTCA 
AQAGCATACC 
GCCACCCAGG 
TACAACAAGT 
CTQACGAGCT 
CACAGTGTGC 



GGTGGTATGG 
CGTGAGCTCT 
ACCAACGGTG 
CTCTTTGGAG 
AGCGACGACA 
AAGCTGGTCA 
CCCXX:CXACT 
GCCACATCCT 
GGGTCCTTTC 
GACAGAGGGG 
A6ACX3AAGAC 
TAAGTTTCXa 
ACAAAGCAGA 



TCACGCTCTT 
GCATCTAOGT 
TGCTCaVTCGA 
TCCAGC3VGAT 
GCCCCTGCTT 



CATCGTOCTG GCX3ACCTTCC 
CTTCACGCTC CTGGACCATT 
AGCCATOSGA GTGGCCTGGT 
GACOGGGCAG CGGCCCAGCC 
TCTCCTGTTC GTGGTCGTGG 
CaTCTTGCCC QACTGGGCCA 



CTCACAGTAG CTTCCTAaAC 



GTTGTCCCTO 
TCCCTGCTCC 
ATCAOG ATCC 
CATTTACTTT 



AGACTCCTCT 
TGTGTTGCTO 
CAGGGCAGAA 
CTGCTCCCGG 



GTTCAOGCTC 
OVTCCTGCAA 
GGCAACTTCT 
CTTCTGACTG 
TAATAAOSAC 
AAACGTCTAA 
CTCTGAGGCr 
ACCTGCIGAG 



ACTCTTCAAC' 
TTTACACCTT 
aTAGATCTGT 
CTTCATGCTG 



1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 



1740 

1800 

1860 

1920 

1980 

2040 

2100 ' 

2160 



GATGCGTOGC TCCCAGCAGA 
GTCTGTTCAG AGGCATTGGA 
ATGTTCTTGC AGCAGAOAGA 
GGCAGCCTGT GGGTCXrTTGT 
GACGCATGCA GGGCCCCCAC 
AGOAGCATGT CCTATCCCTQ 
AACGCATQCA GGGCCCCCAC 



GGCCGTAAAT 



GGTGTAGGGA 
AGGAQCGTGT 
GACGCATGCA 



AAACCACAAA 

TGAGOGTTCA 
CCTGGTATGT 
CTTGAAACCA 
ACGGCCTGAQ 
CCTATCCCCG 
GGGCCCCCAC 
ACTACCCCAS 



TTCATGCAAA 
GTTGACACAT 
CTCACCAGQA 
GCTCAGGCTA 
AGGAGCGTGT 
GACGCATGCA 



ATCAAAACAA 



CGTGTACTAC 
ATGCAGGGCC 
CCTCCAGGAA 
AACAGTTTTT 



CACACTQCCC T 



AATTCTGTTT 
CTGCCACrCA 
CCTATCCCCG 
GGGCCCCCAC 
ACTACCCCAS 
GGGCCCCCAC 
CCTATCCCCG 

CCCACAGGftG CGTGTACTAC CCOkOGAOGC RTGCftOGGCC 
CCCAGGATGC 
CCCATGCAGG 
GGGACCCCAC 
ATGTTTGCGA 
GCA6TATCOG 
TACTQAGTQA 



CAGCCTGCAG 
TGGAATTTTA 
ATGGCTTTTT 
CGAGCCTGCT 
AGGATGTTGO 
GGATGCCATA 



CCCACAGGAG 
ACCAACACTC 
TTTCTCTCAG 
AAAATCATAT 
TGCTGATATT 
CCAAAAGCTG 
GTTTGAATTC 



CGTGTACTAC 
TQCCTGGCCT 
GTGCGTGCCA 



TCCCCTGCAA 
GCCTGTGAAC 
GGGAGGGACA 
GCGGCTTCCC 
GTTGTTQAAQ 
AGCAACCCAG 
TAAGCACAAT 



CAGAGGACGG 
CATTGCCTTC 
ACAGCACA6A 
GTGTTGTCCQ 
AAAA6ACATC 



GCTCCAGGQA CTGOAGTOTA 
CTGCAOTTAG CACAGAGGAT 
CTTCCCCATC GCCTTCTGGC 
TGGGGAGGGA CACAGAGGAC 
GAGCGGCTTC CCCATCGCCT 
TOTCTGTTQA CCAATCTCTA 
CACAATGOAA AAAAAAAAAG 



ATGCTCGGTG 
GGCTTCCCCA 
CGCTGCAQTC 
AGTTTCCCCA 



GCAGTTTTTG 
CTTTCCATGG 
ATGCCTCAAG 
GTCATGGCTG 
GGAGCC6TCA 
TTGCCTTCTG 



2280 
2340 
2400 
2460 
2520
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 

3420 
3480 
3540 
3600 
3660 

3780 
3840 
3900 



I 



21 
I 

3 PNAVGPKEVB 
ETHGKKIDFL LSVIGFAVDL AKVWEPPYIjC 
GQFMREGAAG VWKICPIUtG VGPTVILISli 
MNSNNSPNCS DAHPGDSSGD SSGIiNDTFGT 
TACLVLVIVL LYFSLVflCGVK TSGKWWITA 
/ WIDAATQVCP SIjGVGPGVI.1 



r IT6I.IDGFQI. UntHREUrTL 
GTSII.F6VLt BAIGVAHPYQ VGQFSODIQQ 
!^ yiProWAMAL GKVIATSSHA 
ft QFTLRHHIiKV 



Seq ID NO: 510 DNA sequence
Nucleic Acid Accession #: NM_001216.2
Coding sequence: 43..1422



31 

I 

LILVXSQNGV 
IfKMGGGAFIiV 
yVGFFXNVII 
TFAAHYFERG 
TMPYWIiTAL 
AFSSYNKFTO 
PIIYPBAIAT 
FIVLATFLLS 
MTGQRPSLYW 
MVPIYAAYXF 



41 
I 

QLTSSTLUNF 
PYLLFMVIAQ 
AWALHYIJ'SS 
VIiHIiHQSHGI 
IiUlGVTLPaA 
NCYRDAIVTT 
LPIiSSAMAW 
LPCVTNGGIY 
BLCWKLVSPC 



FTTELPWIHC 
DDIiGPPRWQL 
IDGIRAYIiSV 



FPIMLLTLGI 
VPTLLDHFAA 
PLLFWWSI 
LAYAIAPEKD 



11 



21 



31 



41 



51 



| | | | | |

GCCCGTACAC ACCGTGTGCT GGGACACCCC ACAOTCftGCC GCATGGCTCC CCTGTQCOCC 
AGCCCCTGGC TCCCTCTGTT GATCCOGGCC CCTGCTCCRG GCCTCACTOT GCAACTGCTG 
CTGTCACTGC TGCTTCTGAT GCCTGTCCAT (XXXAQAGOT TGCCCCOGAT GCAGGAGGAT 

Tcccccrroa gagqaggctc ttctggggaa. gatgacccac tggqcqaosa gsatctgccc 




AGTGAAGAGG ATTCACCCAG 
GA.GGATCTAC CTGGAGAGGA 
TCCCTGAAGT TAGA(3GATCT 
JVATARTGCCX: ACAGGGRCAA 
CCGCCCTGGC CCCGGGTGTC 
CGCOXCRGC TCMCCGCCTT 
CrCKCXSCQGC TCCCAGAACT 
OCTCCTGGGC TAGAGATGGC 
CTGCACTGGO GGGCTGCAGG 
CCTGCCGAGA TCXaVCXTTGCT 
OGGCQCCCOG GAGaCCTGGC 
AGTCCXrrATG AGCAGTTGCT 
CAGGTCCCAC3 GACTGGACRT 
TATGAGGGGT CTCTGACTAC 
CAGACAGTGA TGCTGAGTGC 
GGTGACTCTC GGCTACAGCT 
GAGGCCTCCT TCCCTGCTGa 
AATTCCTGCC TGGCTGCIGG 
ACCTVGOOTOG OGTTCCTTGT 



AGAGGAGGAT 
GGATCTACCT 
ACCTACIGTT 
AGAAGQGQAT 
CCCAGCCTGC 
CTGOCCGGCC 
GCGCCTGOGC 
TCTGGCTCX:C 
TCGTCCGGGC 
TCACCTCAGC 
CX3TGTTGGCC 
GTCTCGCTTO 
ATCTGCACTC 
ACOGCXJCTGT 
TAAGCAGCTC 
GAACTTCOOA 



CCACCCGGAG 
GAAGTTAAGC 
GAGGCTCCTG 
QACCAGAOTC 



AGGAGGATCT 
CTAAATCAGA 
GAGATCCTCA 
ATTGGCQCTA 
TCCAGTCCCC 



TOGGAGCACA 
ACCGCCTTTG 
GCCTTTCTGG 
GAAOAAATCQ 
CTGCCCTCTG 
GCCCAGGGTG 
CAC3WXXTCT 
GOQAOGCAGC 



ACOGGGCTCT 
CTGTGGAAGG 
CCAGAGTTGA 



ACCTGGAGAG 
AGAAGAGGGC 
AGAACCCX3VG 
TGGAGGCGAC 
GGTGGATATC 
GGGCTTCCAG 
ACTOACCCTO 






CTGAGQAAGG 
ACTTCAGCCG 
TCATCTGGAC 
CTGACACCCT 
CTTTGAATGG 



CCACCGTTTC 
CQAGGCCTTa 
GGAAGAAAAC 
CTCAQAGACT 
CTACTTCCAA 
TGTGTTTAAC 



AOTGGACAGC AOTCCTOGGG CrOCTOAOCC AGTCCAGCTG 
TOACATCCTA GOCCTOOTTT TTGGCCTCCT TTTTGCTSTC 



ATGCCACTTC CTTTTAACTG 



GGIAGCCQAG ACTGQAGCCT AGAGQCTWSA TCTTO GAGAA 
CATCTGAGGG GGAGCCGGTA ACTOTCCTGT CCTGCTCATT 
CXAAGAAATT TTTrAAAATA AATATTTATA AT 



31 



1140 
1200 
1260 
1320 
1380 
1440 
1500 



I I t 

MAPIiCPSPWI. PI.I.IPAPAPG LTVOLUiSLI. LLMPVHPQRL PRMQEDSPU3 GQSSGEDDPL 
GEEDLPSBED SPREEDPPGB EDIiPGEBDLP GEEDLPBVKP KSEEEGSLKI> EDLPTVEAPG 
DPQBPQNNAH RDKEGDDQSH VJRYGGDPPWP RVSPACAGRF QSPVDIRPQL AAFCPALRPL 
EhUSFQl^Ph PBLRIJUINGH SVQLTLPPOI. EMALGPGRSY EALQUHLHWG AAGHPGaEHT 
VBQKRFPAEI HWHI.STAPA RVDEAIiORPa GLAVLAAPLB E GPEBH SAYB Ql,IiSRLBBIA 
EEaSBTQVFG LDISALLFSD FSRYFQVBGS I.TTPPCAQQW IHTVPHQTVM LSAKQI.HTI.S 
DTLHGFGDSR LQLNFRATQP 12IGRVIBRSF PAGVDSSPRA AEPVQLNSCL AAGDIIiALVF 
GLLFAVTSVA FLVQMRRQHR BGTKGSVSYR PAEVAETQA 

Seq ID NO: 512 DNA sequence
Nucleic Acid Accession #: Eos sequence
.3978 






ATGGTGOGTa AAGGACCCTA CCTTATCTCA GATCTGGACC AGCGAGGCOB 
TTTGCAGAAA 
TTAGCACCCA 
ACGCCGGTGA 



OATATQACCC 
ACCCGGTGGA 
TGGTGAAAGG 



GTAGCAAGGG 
ACAOSCGTGT 
CCGACAGTTC 
GTTGGCATTG 



TOGQTCCTGA 
TGATGGACAT 
TCATTCACCA. 
OACTSTGCAT 



CTCAATATAC 
CXaCCXACCA 
CCCACAGCTC 
GCCAAGCTCA 
ACAATGAATG 
TTTACCAACA 
TTTGTCCAAA 
ACATTATCCT 
ATTGOCATGT 
ATGGCTGAAG 
CCATCTTACA 
TTGACATGGG 
AGGCATTTAT 



TTTGTGGTGA 
GCAGTCTTTG 
AAGGATGAAT 
GCCAAATACC 
CTCCTTGCAG 



AAAACCTAGT 
TGTCAAGTGA 
TCCCX3ATCCT 
TCATCGGGAT 
ATTCAGCTTT 
AGTTTCTGAC 
CTATCCAAGA 
GTGGAAACTC 
GCCACATCXrr 
TTAATGTAAT 
aSAATGTCTC 
TCACXXa^ACC 
AGCATGAAGC 
GCAAGAAACA 
GCCCAGAGGA 
GAAAOTTATO 
TTGGGAGAAT 
CTAGAAGGCT 
TGGGGAAGAT 
CrCTCCTAGG 



TGAT6CCGGG 
CTACCGGCAA 
CACCAATQCC 
GAAGGCCTCT 
CGTGGCCAAC 
AATCCTCCAG 
AGCCCTTTTT 
CTACCGCACQ 
GTCCTTCAAG 
TAGCTATTCT 
AATGGTCTTT 
ATCAGTGTAT 
CCGAAGGTCA 
CTGCRTCAGG 
TATAAGAAGG 
TGCCCTGGCC 
CCTGAGACQC 

GAAorrrrcc 

TCTAAGGAGA 
AGAAOACCCA 
CA6CAGGAAA 
GAGGTCA6AG 
GCAAAGTGAC 
TCGTTATCCC 
CATCAGAQGA 
TCTTACTTGG 
CnCGGGAATA 
ACAGATCCA6 



AAAAGATTTC 
CTGAGCCACG 
ATCCTGTGCA 
CAGACTGA6A 
GCCACCGAGT 
GCCATCCGGT 
ACATTGACCC 
TTGTTTGAAG 



CAGTGCGACC 
TOGCCACATT 
TAGACACCCX 
GAGTCCTTTQ 
TGGTGTGGAA 
TCATCATGGC 
GGACXrrCTGG 
TTACCAAAGT 
TGAAGGTGGC 
ACATCTCTGT 
CTQCCTTGTT 



GCGGA6ATCC 
CTGTGCaVAGQ 
TTCCTGGCTC 
GCiCCCCATTG 



GCTCTCCACC 600 



GCAATTTTGa 
CTGATCAAAA 
AGGGAAAGAA 
CCCATCGTGT 
AAACrCACCG 
ATTGCAATCT 
ATOAAOAAAA 
OATACIGTCT 
ASTACCCCAA 
GCATACAGTG 
AQCCTCAAAT 



TQACASACAA 
TGTATGCCTG 
AATTACTGGA 
CCACCATAGC 



TTGTCCTTTG 
CATTCTGGGO 
GATGTTTATa 



GGAGAAATCT 
AAAAGCTGGA 
CATCGTGCTG 



CTCTTIGGAO 
CAGAAGGACC 
AACCTCTCTG 
CAGCTCTACC 



TACAQGOCTC 
(XCCAAGAAG 
TQTGGQAATG 
CTGCAGAAAG 
ATCITTC3VIG 
TATCKSCACh 



naAAATCGCA 
AGAGQAGTCC 
CGGTTC TGCA 
TCCTGGCTTG 
ATGGATTTTC 
TGGAtAGGAC 
TGGGAAGTGG 
GGGTGGT6GC 
GAAATGKSUS 



•AGC AAATGCCACC 1320 



ACCAGCCAAS 
CAGCATAAGC 
GAGGTGGCCA 
TGCTAAAGAC 
TCAAAGGGCA 



AOAAAACATA 



TOAOCAACCT C 




CCCCXZTGTCO GCXXrrGGACG 
GACGCTCAGG GGAAAGACAG 
TGATGAAGrrr ATTTTATTAG 
AATGGAGGAO AGAGGGCGCT 
GGATCCTOAA CACXn;TTACA 
QAGAlQAGGAA GATGCTGGTA 



OaOCTCAOGG ATOACCrGTC 
GCrGGCAGAC ATCGSTCAaC 
GCTOGTOTTT GGOSTCAOCR 



ASAS^TGGGQA GCOSGGCCTC 
SCGCTGTCTA CTCCGACCOT 
CXXACQTGGG GAAGCACGTC 
TCQTCCTGGT GACCCACCAG 
AAGATGGAGA GATTTGTGAA 
ATGCAAAACT GATTCACAAC 
ATGCAGCAAT GGTGGAAGCC 
TAATCX3GGTA CCTOCTTTCT 
CTOOCTTCAG 



AAG6CTTGOT CITCACCAAS 



1380 
1440 

1500 
1560 

1680 
1740 

1800

1860 
1920 
1980 



2460 
2520 
2580 
2640 



ACCRCACTGft TGGCATCCTC CTCTCTGCAT GACACGGTGT TTGATAAGAT CTTAAAGAGC 2700

CCAATGAGTT TCTTTGACAC GACTCCCACT GC3CAGGCTAA TCAACCGTTr TTCCAAGGAT 2760 

ATGGAOSAGC TGGATGTGAG GCTGCOGTTT CAOGCAGAGA ACTTTCTGCA GCAGTTTTTT 2820 

ATGGTGGTGT TTATTCTOGT GATCTTGGCT GCTGTGTTTC CTGCTGTCCT TTTAGTOQTG 2880 

GCCAGCXnTG CTGTAGGCTT CTTCATTCTG TTACGCATTT TCCaCS^GAGG AGTCCAGGAG 2940 

CTCAAGAAGQ TGGAGAATQT CAGCCGGTCA CCCTGGTTCA CCCACATCAC CTCCTCCATG 3000 

CAGGGCCTGQ QCATCATTCa OGCCTATGQC AAOAAGGAGA GCTQCATCAC CTATACTTCA 3060 

TCCAAAGGCC TGTCATTGTC ATAC3KTCATC CAGCTGAGOG GACTGCTCXA AGTGTGTGTG 3120 

CGAACGGOAA CAGAlGACGCA AGCC3U«^TTC ACCTCCGTGG AGCTGCTCAG GGAATACATT 3180 

TCGACCTGTG TTCCTGAATG CACTCATCCC CTCAAAGTGG GGACCTOTCC CAAGGACTGQ 3240 

CCCAGCTGTG GGGAGATCAC CTTCAGAGAC TATCSiGATGA GATACAOAGA CAACACCCCC 3300 

CTTGTTCTCG ACAGCCTGAA CTTQAACATA CAAAGTGGGC AGAa«»TCGG GATTGTTGGA 3360 

AGAACAGQTT COSGAAAGTC ATCGTTAOGA ATGGCTTTGT TTCGTCTGGT GGAGCCAGCC 3420 

AGTGGCACAA TCTTTATTGA TGAGGTGGAT ATCTGCATTC TCAGCTTGGA AGACCTCAGA 3480 

ACCAAGCTGA CTGTGATCCC ACAGGATCCT GTOCTOTTTQ TAGGTACAOT AAGGTACRAC 3540

TTGQATOCCT TTGAGAGTCA CACC6ATGAG ATGCTCtGGC ASeTTCrGGA GAGAACAXTC 3600

ATGASftGACA CAATAAW3AA ACTCCCAGAA AAAT TACAGG CAGAAG TCAC AOAAAATGGA 3660 

OAAAACTTCT CAGTAGGGGA ACXiTCAlGCIO CTTTGTGTQO CCOQAGCTCT TCTCaSTAAT 3720 

TCAAAOATCA TTCTCCTTCxA TGAAGCC3VCC GOCTCTATGa ACTCXSVAGAC TOACACCCTG 3780 

QTTCAGAACA CCATCAAAGA TGCCTTCRAS GGCTOCACTa TQCTGACCAT CGCCCACCGC 3840 

CTCAACA.CAG TTCTCAACTG CGATCACGTC CTGGTTATGO AAAATGGGAA GGTQATTOAG 3900 

TTTGACAAGC CTGAAGTCCT TGCAGAGAAG CCRQATTCTQ CATTTGGGAT GTTACTAGCA 3960 
GCAGAAGTCA GATTGTAG 

Seq ID NO: 513 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51 

| | | | | |

MVGEGPYLIS DLDQRGRRRS FAERYDPSLK TMIPVRPCAR lAPNPVDDAG I.LSFATFSWI1 60 
TPVMVKGYRQ RLTVDTLPPL STYDSSDTHA KRFRVLWDEE VARVGPBKAG LSHWWKFQR 120 
TRVLMDIVAN II.CIIMAAIG PTVLIHQILQ QTEKTSGKVW VGIGLCIALP ATBFTKVFFW 180 
AUWAINYRT AIRLKVALST LVFEHLVSFK TLTHISVGEV UniiSSDSYS tiFEAAIiFCPL 240 
PATIPIIMVF CAAYAFPILO PTALIGISVY VIPIPVQMFH AXIMSAFRRS AILVTOKRVQ 300 
TMIBFLTCIR LIKIWAMEKS FTOTIQDIRR RERKLLEKAG FVQSGMSAIA PIVSTIAIVl. 360 
TLSCHILLRR KLTAPVAFSV lAMFNVMKPS lAILPPSIKA MABAHVSLRR HKKII.IDKSP 420 
PSYITQPEDP DTVLLLAHAT LTWEHBASRK STPKKIiQNQK RHLCKKQRSE AYSERSPFAK 480 
GATGPBEQSD SLKSVLHSIS FWHKLCRYP EAQIJiAMRWP AVFWGRIIRG YRPHGFSAKD 540 
KDESRRLIiTW PQEVDRTQRA AKYLQKILGI COIVGSGKSS LLAALLGQMQ LQKGWAVNG 600 
TLAWSOQAW IFHQIVSENI LFGEKVOHQR VQHTVRVCGli QKDLSNliPYG DLTEIGERGD 660 
KIiSGGQRQRI SLARAVYSDR QLYLLDOPLS AVDAHV6KEV FEECIKETLR GKTWIiVTHQ 720 
LQFLBSC33BV ILLEDGBICE KGTHREIiMEB RGRYAKLIHH LRGIiQFKOPE HLYNAAMVBA 780 
FKESFAEREE DA6IIGYI>I>S LFTVPLFIiLM IGSAAFSNWM LGI.WLDKGSR MTOGPQOIRT 840 
HCEVQAVLAD IGQHVYQWVY TASMVFMLVP GVTKGFVFTK TTIjMASSSLH DTVFDKILKS 900 
PMSFPDTTPT GRLMNRFSKD MDEUWRLPP HAENFMXJFP MVyFILVILA AVFPAVLLW 960 
ASIiAVGFFIIi UUFHROVQE I1KKVENV8RS PWPTHITSSM QGLGIIHAYG KKESCITYTS 1020 
SKGLSLSYII QIiSGLLQVCV HTGTETQAKP TSVELIiREYI STCVPECTHP LKVGTCPJOJW 1080 
PSCGBIXFRD YQMRYRDNTP LVIjOSLNIMI QSGQTVQIVG RTGSGKSSLG MAliPRLVBPA 1140 
SGTIPIDEVD ICILSLEDIiR TKLTVIPQDP VI.FVGTVHYN LDPFESHTDE MLWQVLERTF 1200 
MRDTIMKLPE KLQAEVTEtTG EHFSVGERQL LCVARALLSN SKIILLDEAT ASMDSKTDTL 1260 
VQNTIKQAFK GCTVLTIAHR lOTVLSCDHV IiVMENGECVIE PDKPEVLABK PDSAFAMLIA 1320 
AEVRL 

Seq ID NO: 514 DNA sequence 
Nucleic Acid Accession #: Z31560 
Coding sequence: 1-966 

1 11 21 31 41 51 

| | | | | | 

CaCAQOOCCC GCATGTACAA CATGATGGAG ACGGAGCTGA AQCCGCCGGG CCOSCAGCAA 60 
ACTTCGGGGG GOGGOSGCGG CAACTCCACC GCGGOOQCGG CCOGGGGCAA CCAQAAAAAC 120 
AQCCCGGACC GCGTCAAGCG GCCXaiTGAAT GCCTTCATGG TGTGGTCCCG OGGGCAGCGG 180 
CGCAAGATGG CCCAGQAGAA CCCCAAGATG C31CAACTCGG AGATCAGCAA GCGCCTGGGC 240 
GCOGAGTGGA AACTTTTGTC GQASACGQAQ AAOCaGCX^aT TC3UrOSACGA GGCTAAaaSG 300 
CTOCGAGCGC TGCACATGAA GQAOCACCCG QATTATAAAT ACCGGGCXX30 QOOGAAAACC 360 
AAGACGCTCA TGAAQAAGSA TAAGTACA06 CIGOCCGGOG GGCTGCIGGC CC0CG6CG6C 420 
AATAGCATGG CGAGOSGGOT GGGGGrGQGC GCQGGCCTeQ aCaCGGOCQT OAACCAOGOC 480 
ATGGACAGTT ACGCGCACAT QAAOGGCTGG AGCAA0G6CA GCTACAGCAT GATGCAGGAC 540 
CAGCTGGGCT ACCOGCAGCA CCCGQGCCTC AATGCGCAC6 GCGCAGCGCA GATGCAGCCC 600 
ATGCACCGCT ACGACX5TGAG CGCCCTGCAG TACAACTCCA TGACCftQCTC GCAOACCTAC 660 
ATGAACGGCT CX3CCCACCTA CAGCATGTCC TACTCGCAGC AGGGCACCCC TGGCATGGCT 720 
CTTGGCTCCA TGGGTTCGGT GGTCAAGTCC GAGGCX»GCT CCBGCCXrCC TGTGGTTACX: 780 
TCTTCCTCCC ACTCCAGGGC GCCCTGCCAG GCCGGGGACC TCOGGGACAT GATCAGCATG 840 
TATCTCCCOG GCGCCQAGGT GCCGGAAOCC GCOOCCCCCA OCAGACTTCA CATQTCCCAG 900 
CACTACCStfSA GCGGCCOGGT GCCC!0GCAOQ GOAXTAAGQ GCACACIOOC CCTCTCACAC 960 
ATGTGAGGGC CGGACAGC6A ACTGGAG6GG GOAaAAATTT TCAAAGMAA ACGAGGGAAA 1020 
IGGGAGQGGT GCAAAAGAGQ AGAQTAAlSAA ACAGCATGGA GAAAACCCG6 TAGQCTCAAA 1080 
AAAAA 

Seq ID NO: 515 Protein sequence 
Protein Accession #: CAA83435 

1 11 21 31 41 51 

| | | | | | 

HSARMYNMME TELKPPQPQQ TSGGGGGNST AAAAGCBIQKir SPDRVKRPMH AEMVHSRCQR 60 
RRMAQEHPKM HKSEISKRLG AEWKLLSETB KRPFIDBAKR UtAIJiMKBHP OYKYRPRRKI 120 
KTU4KKDKYT LPGGIaLAPGG NSMAS6VGVG AGLGAGVNQR KDSYAHMNGW SNGSYSMMQD 180 






WO 02/086443 

QliGYPQHPGL MAHGAAC3MQP MHRYDVSALQ WSMTSSOTY tOKSSPTYSMS YSQQGTPGMA 
LGSMGSWKS BASSSPPWT SSSHSRAFCQ AGDUIDMISM 
HYQSGPVPGT AINGTLPLSH M 

Seq ID NO: 516 DNA sequence 
Nucleic Acid Accession #: U91618 
Coding sequence: 29..541 



21 



31 



CGGACTTCGC 
CATGCTACTC 
AGC31TTAGAA 



TTGTTAGAAG 
CTGGCTTTCA 
QCAGATTTCT 
AAQATGACTC 



I I 
GCTGAAAGAT GATGGCAGGA 
GCTCCTGGAC TCTGTGCTCA 
TGACCAATAT GCATACATCA 
TGCTAAATQT TTGCAOTCTT 



ATGAAAATCC 
GATTCAGAAG 
AAGATTAGTA 
GTAAATAATT 



T6ACAAAAAT 
GCTGTATQAO 
AGAGAATAAA 
ATTATATTTG 
ATTGAATGTG 
TCTTCAAAAA 



TCATTTATTT 



ACTGGGASTT 
AAGTCATAAA 
QAAOACCCTA 
ACATGTGATT 
TGACAAACAC 
CTAATA6AAA 
AAATGGGGCC 



CATACTCAAA 
GTGATTCATC 
ACTTATCTGT 
TTAGACTAAQ 
GCAATT 



CCTTATATTC 
AGAGATTCTT 
ATCCCTTAAT 



51 
I 

AGCTTGTATG 
RGGAAATGAA 
AAGCACATGT 
TGAACAGCCC 
AACTTCCTAC 
ACAAAATCT6 
ATACTOOAAA 



TGAAACGGCA 
ACTATTACTG 
TAAATATCAA 



TGTTTTCAAA TAAATCTAAA 720 



| | | | | | 

MMAGMKIQLV CKLLIiAFSSW SLCSDSEEEH KALEADFI.-m MHTSKISKAH VPSHKKTUiN 
VCSliVNNUrS PABETGEVHE EBI.VARRXI.F TALDGFSI.EA MLTIYQIiHKI CHSRAFQRHB 
IiIQEOILDTG NDKNGKEEVI KRXIPYIIJai QIiYEHKPRSP YILKKSSYYY 






ACCTAAAACC 
ATGTATCCAO 
A6CA.TTGCAG 
OAACTCCCAT 



TTGCAAGTTC 
CAGGCTCAGT 
GTCCTATTTG 
TCCTOG6AGC 



ATAAAGATTT 
TCATATGAAA 
TACACCCTAC 
TTCCTACTGA 
GAATGGGCCC 
ATAAATGGGC 



QGATGCACCT 
AGTTTATCTT 
CTACAGAACC 
TTTCACCACA 
GTACSVGGCXa 
GCTGACAGAC 
ATTCATACCT 
CACCAAATTA 
TCAGCTAAAA 
AAACTGAATG 
CTTCTTGGCA 
CTGGGTTCAT 
TTCTTTGTTC 
TCTGGAACTG 
AAACCTCACC 

ATGrrrcTAO 

GGACGAAAAT 



CTTCATTTTA 
TAATACCTGC 
AGOCAAATGT 
AATACAGAGG 
ATGATAACTT 
ACCTCCQTTG 
AAAATCAAAT 
AAGGTCCTTG 
TTATCTACAA 
CTGTGGTTGA 
AGATGTGCAG 
GCTTTCCCAT 
GTGACAAAGT 
TCCTTCaUiCT 
TCGTGGGCAT 
ACaVGCAATQA 
CAGACATCAG 



21 
I 

AGGAAGAAAC 
GTGAGTGAAC 
CRACCTGAAG 
TGGASTACAS 
ACCTGAGAAT 
CCTATTTAAT 
CACATGGAAA 
CATAGTGACT 
GTGTGGAAAA 
AACAGCTGGC 



TAAAQTGACA 
CCCCCAAGAA 
TAGCACCCAA 
ATrTTGTAAT 



31 
I 

CATCTGCATC 
TGGAGGCTTC 
TTTGTGACTC 
CTTCRAGACA 
CAGAACCTCA 
GCTACCAASA 
GCTAATAATA 
GACTGGTATG 
GAGGGAAAAT 
TAOiGATCaVC 
GATGAGTATA 
AGGTQTTCAT 
AACTGTATTA 
AATGCAACTG 
GOUVGTACCX: 
GCATOGOATG 
GAGCTTCCAC 



ATGGGTATAA 
TCTCAAACAT 
GAAQAGTATT 
ACAGCAAAAT 
GGGCACATGG 
ACATTCATTT 



ACCTGACACA 
GACCCAAAGG 
CTTAAGTTCA 
TGGATTGCTC 



TTTCAGAAAT 



ACAATGACAA 
CTGACATCAC 
TTAGTAAGCT 
CATCAATAAT 
ACAACCAAGA 
TAATCACAGA 
CTGCTCCCAC 



CACACCTAAT 540 

GTTTGTCCAT 600 

ACCTTTCTAC 660 

ACGCATTTTT 720 



gttcatgcaa 
agc7m:caaac 
ctctgctgac 



ATTGCTTACC 
CTGCAGCCCC 
CftORTATATC 



ACAACAAGCC 
TGCCAGTTTC 
IGATCGAAAG 
CATTTGTTCA 
TGGCTCTGTG 
CACTOTGCTC 
AAATCTGGAG 
AAACTCCAAT 



GACAGCAAAQ 
TTGCTGGTTT 
GGGCTTAAGA 
ATGATATTAG 
AGCAGTGGTT 
GAATTATCAC 



ATTTGATQCA QATTSTTQAA 
GAGAGATCAG AGCCCAGCTA 
CATATCTGCC CACCACTGTA 
AAGGATTTQA GGTGGTTGAA 



TCTCTGCAAG 
GCCACTGTGG 
TATGCCAATG 



GAACAGCTAA 
CCCreAAAGT 
AAGCCTTTGT 



AAACACAGTG 
GG CCAGTG OT 
TAATTTTATC 
GCCTGGGCAC 
GACAOTGACC 



ATTCAOCTTa 
ACIGTOSATA 
CCTCCTCAGA 



ATOATGOAAT 
AASTOCATGT 
ATGCTATGTA 
AATCAGTAGG 
GCTCCTTTTC 
AAATTATTOA 
CTGGAGAAGA 
TACAGAATAT 
CTCAGCAAGC 
CTCAACATCA 
CAATGGATAO 
CCTCTGTTTA TTCCCOCCAA 



tgatcctgat 
agctagtctt 
taccx:atc3vt 
tgtgccccca 
tgtqatgatt 
tgccacagtt 

AGGIGCTGAT 

TXACTOGAGO TATTTTTTCT CCTITeCTOC AAAXGGTAOA 



: ATTCTTAATG 



CAACAATTCA 
GTCTTACAGQ 
ATGCTTTCAG 
AAASTACAGG 
ATACTGTG66 
TTATATTATT 
CTTTTCGGAC 
CCCTGAACAA 
CCaACTCAGC 
TTCCTCATCC 
CCACTCTCAC 



TOAAAATGTC 



CAATCACTCI 



GCTCCAAGGA 
AGCTCAGQAG 
CCACCATGCA 
TGGACAGCAC 
AOTAAAAGTC 
AAGOQAAATC 



CAGAAATGAG 
AGTGCTGGGA 
CCTGGAAGCT 
CTTTGATCAG 
CCAAOATOAC 



GCOUUIIGGA 
QAACTCCTTA 
TTCTQATCCT 
TTTGATA6QA 



CCCftGCATAA 
TACACAQCaA 
6AGGA6C6AA 
GTTCCAGCTG 
GTAAAAGTAG 
GGCCAGGCTA 
TTTAACAATG 
GAOATATTTA 
GAAACACATG 



AfiTGGGGCTT 
GCCCCCACCC 
AAGAGGAATT 
CAAGCTATGA 
CTATTTTAGT 



CCACTCTATT 
TCAQATGAAT 
TAGCCX3ASTC 
TGATGTGTTT 
GACCCTATCT 
AATAAGAATG 
AAATACATCA 



1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 



1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 

2520 
2580 
2640 
2700 
2760 
2820 
2880 



Ci^TACTTTAA GCASGAAAAA 
ATAAATATCC AAAGTGTCIT 
CATACTAACA AAQTCAAATT 
ATACAGATAA GATTTTTACA 
CCTTACACTT TGGCTATGAA 
GCAAAGGGAA GGGTAAAGTC 
AATAGCCCCA A GCAGA GAAA 
TCATTTAGTT ACTTTGATTA 
TTTACATGAA GATCATGCTA 
CTTGCTATTT TGTTATATAT 
TTTCACTGTA AGAGGTAACC 
TTTATGftCAA AGGTCTATTQ 
TTTCTAAC3TT TATTOCCTTG GGTTATTATG 



GAGAGCAGAC AAGAAAGAGA ATSQAACAA; 



CCTTCTTAGA 
AACATCAAAA 
TGGTAGATCA 

CAAATAATAA 
GGACCAGT6T 
AGGAGGGTAG 



CTOTATTAAA 
ACAATTCTTT 
AAATTATTCT 



TATTTTATAT 
ATTTCAGATG 
TTTAACAATA 



GTCTGCATTA 
TCTCCTTATC 
ATQTAGCCCC 
ACATCrCCCT 
TGGGTATTAC 
THTGTAASTT 
GAATGATAOT 



ATGGCCTT06 
ATGCATTGAS 
TTGGGG6TAG 
TTAAAGTAAT 
TTGTTTTATT 
TAACTGTCTC 
TGTGCAGTAC 



TCTACTCCCA 



2940 
3000 
3060 

ATTAGAAAAC 3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 



PCT/US02/12476 



GAGGTGGAAA 
TGTGAAGCAA 
AGGTTGCTTG 
CTCTTTACCT 
AGAGATCTTT 
TCATACCGGT 
TCAAAQCAGC 
TATAATGCCT 






GDDPYTLQYR 
KPFYIUaaNQ 
MFMQSI.SSW 
TFSLVQAODK 



GCQKEGXYIH 
IKVTRCSSDI 
EFCHASTHNQ 
WCLVU3VSS 
DDBICU.VSYL 
PTWLSSGSn 



LTAGYGSRGR 
CPQGNCIISK 
SIiRSAWDVIT 
LQQAABFYLM 
SICSGLKKGP 



VFVHEHABLR WGVFDByHHD 



HTQRSIAQFI CMLKFVTIiLV AI^SBLPFLQ AGVQIiQDNQY NGUiIAINPQ V 

FPiaiIKII.IP " 

FTPlfFLIiHDM 
TGIFVCBKGP 
EAPNIiQNQMC 
KMABADRIJ.Q 
PTTVSAKTDI 
HSIALGSSAA 
GEHVKPHHQI. 



FDPOGRKYYT 
AVPPATVEAF 
AGADVIKNDG 
IQMNAPRKSV 
LTliSWTAEGE 



V KPGHWTYTUI 



lYSHYFPSEA ANGRYSLKVH 



IiILECGVLTAM 



DFDQGQATSY EIRMSKSLQN 
QPHGETHESH RIYVAIRAMD 
GLIGIICLII WTHHTIiSRK 



VNHSPSISTP 
SVLGVPAGPH 
IQDDFNNAIL 
SNSIiQSAVStr 



QIVBIHTFVG 
EWEKIiNGKA 
GGUCFFVPDI 
caiDTMFLVTW 
NTHHSWJAIJC 
TATVEPETGD 
AHSIPQSHAM 
PDVFPPCKII 
VNTSKHNPQQ 
lAQAFIiFIPP 



YQSVMILVTS 
SNSNSMIDAF 
QASGPPEIIL 
VTVTSRASNS 
PVTLRLLDDG 
YVPGYTANGH 
DLEAVKVEEE 



HSOFVFABDY 900 



Seq ID NO: 520 DNA sequence 

Nucleic Acid Accession #: NM_000228.1 

Coding sequence: 82..3600 



GCTTTCAGGC 
GGATCACCCC 
CTCCTGCATG 
CTTGTTGGGA 
ACCTACTGCA 
CCTCACAACT 
TGGTGGCAGT 



31 



41 



GAGCGCTCCT 
ACCTCCACCT 
CAGTCCCTGC 
ATGGATTTAG 
ATCACAAACT 

ccrrccAGCG 

GGCCATGCTG 
CA6GTCCACQ 



I I 
GATCTGGAGA AAGAAGOOCA GAACACACAS CAAGOAAAC 
ATTGGCTGAA GATGAGACCA TTCTTCCTCT 1 
CCCAACAAGC CTGCTCCCOT GGGGCCTGCT f 
GGACCCGGTT TCTCaSAflCT 1 
CXXAGTATGG CGAGTGGCAG ATQAAATGCT GCAAGTGTGA 
ACTACAGTCA CCGAGTAOAG ARTGTGGCTT CATCCTCCGG 
CCCAGAATGA TGTGAAOCCT GTCTCTCTGC AGCTGGACXrT 
AAGAAGTCAT GATQGAGTTC CAGGGGCCCA TGCCCGCOGG 
CAGACTTCSG TAAGACCIGG CQASTGIACC AQTACXZTGGC 



CCTGCCTGGC 
TGGGQACCTG 
CAAQCCTGAG 
CTCCAGGCAG 
CCCCATGCX3C 



CATQCTGATT 480 
TGCCGACTGC 540 



TAATGCACGC 
TCCAGCAACT 
TTTCACCAGG 



CCTACTATGC 
ATCGCTGOSC 
ATGTCTGTGT 
ACAACAACCQ 



ACCCAAGCCT 
CTGCCASCAC 
GCCXriGSAQA 



CTAAATGGGG 
CAAAGTCAAA 
CTGGCCCCTG 
CTCCGTCTGC 
GGGGCCTCTG 
AACACIGCC6 



GCCAGCCAG6 
AACTGtGAGC 
GAGAOCTGCA 



AAGCCGGGCT 
AACATCCTGG 
CTGOCCAACG 
AGI^GGCCAGG 



GGTSTCAGCT 
TCTCCTGCGA 
GGCAGTGTGT 
TCACTGGACT 
GGTCCCGQAG 
TGGTGQ6TCC 
GCT6TGAACC 



AGOTGTOIOT 
GCACTATTTC 
GTGTGATCCG 
GTGCAAGGAG 
CACCTAOSCC 
GGACATGCC6 
CAAAT GTCA C 
GTGTGOCTGC 



ACATGTCACT 
QACRATTGCC 
CGGAACCGGC 
6ATGGGGCAG 
Ca^TGTGCAGG 
AACCKGCAGG 
TGTGACGAGG 
CAGTGTGCTC 



AGGGGAGCTG 
CAGGCCCCTC 
GCCCAAATTG 
GCCAGGACGC 
TTGACCCCGC 
GGGACCRCAC 
GCCCGGGAGC 



V ACTTAACCTT 

^. GGTGGGGGAG 
CTACCAC 



GAGAGCGCTG 



GCAGCCKTCC C 
TGTGACTGTQ P 
CTCTGCCX5CC C 
CGCTACCOGG 1 
GAGCAGGCXrC 1 
GGGCIGGAGG A 



3 GAAGQCTTSXi 



C CGGGCCCX3GC 
C CTQCCACOCT 
3 TAGACTCCX3C 
r GGCCTCCCGG 
3 CCXXJQCAOTC 



GCCATCCTCT CCCTCAGGCQ f 
QASAOGTTGT CCCTTCCX3AO t 
ACTAIOTATC ACAGGAAGAO GaAGCAOITT 

AGCCTAOGAG 

CCAGCTCAGG 
AGGCACCGGC 
GACACCCACC 
ATGCCCTGGT 



TGCTTCCAGA 
AATGCCACCG 
ATCCTAGATG 
ACAGAGCAGG 



AGAGTGGGCG 
CCTACCACTG 
ACTCCCCTCA 
GTGGCCTGAT 



CCTATGATGC 
CCAGCCTGT6 
CAAAGAGTAA 
AGGTGGCTC3V 



CACasCTGTG 
TGAGCGCTGT 
CCATGAATGC 
TGTGTTTGCC 
CGAAGGCAAG 
TTCCATTCAG 
TCCCTGTGAC 
TGACCTATGC 
CTGTGACIGC 



GAAGCTGGCC 
GCCCACAGTQ 
GTGCAGOGCT 
ATGCCGAGCC 
AGGCCGCTGC 
CTACTGCAAT 
GGACCTCCGG 



ATGTCTTCGT 
ATGGCTTGCA 
TGTGGCTCCC 
CAGGTGGCTG 



TGCCTGACCT 
CCCCAATATC 
GCTGCAGGGG 
AGCAGCTGOG GGGCTTCAAT C 



MiICTTGACSl 
OAAAAAATAA 

CAGTCAGCXX: 
QACAGCCGGA 
AGCCCCAAGC 
TTCAACAAGC 
GAaCTATGTC 



TGGATCTGCC 
GAAGCTTCAA 
GCAGTGCTQA 
AGGCTGCTCA 



TGGTCTCXrrT 
TCCTTCAGGA 
GCAGGTCTCC 
GAGGCTGGTG 



TCTGTGGCAA 
CCCAAGACAA TGtSCACAGCC 
GGQCCTTCTT GATQGCQGGO 
AGCGGACCAG GCAGRTGATT 



1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 



2340 
2400 
2460 
2520 
2580 



AGGGCAGCCG AGGAATCTQC CTCACM2ATT CAATCCRGTG CCCftGCGCTT GGAGACCCAG 2700 

QTGAGCGCCh GOCGCTCCCA GATGGAGGAA GftTGTCAGIVC GCACACGGCT CCTAATCCaiG 2760 

CAGGTC06GG ACTTCCTAAC ASACCCCGAC ACIGATGCAG CC»CTATCCA GGAGGTCAGC 2820 

GAGGCCGT6C TGGCCCTGTG GCTGCCCACA GACTCAGCTA CTGTTCTGCA GAAC3ATGAAT 2880 

GAGATCCAGG CCATTGCAGC CAGGCTCCCC AACGTGGACT TGGTGCTGTC CCAGRCCAAG 2940 

CAGGACATTQ CXSCGTGCCCX3 CCGGTTGCAG GCTGAGGCTG AGGAAGCCAG GAGCCGAGCC 3000 

CATGCAGTGG AGGGCCAGGT GGAAGATGTG GTTGGGAACC TGCGGCAGGG GACAGTGGCA 3060 

CTGCAGGAAG CTCAGGACAC CATGCAAGGC ACXa.GCCGCT CCCTTCX3GCT TATCCAGGAC 3120 

AGGGTTGCTG AGGTTCAGCA GGTACTGCGG CCAGCAGAAA AGCTGGTGAC AAGCATGACC 3180 

AAGCAGCTGG GTGACTTCTG GACACGGATG GAGGAGCTCC GCCACCAAGC CCGGCAGCAG 3240 

GGGGCAGM3G CAGTCX»GGC CCAGC3>GCrT GOGGAAGGTG CCAGCGAGCA GGCATTGAGT 3300 

OCCCAAOAGa GATTTGAOAa AATAAAACAA AAGTATGCTQ AGTTGAAGGA CCGGTTGGGT 3360 

CAGAGTTCCA TOCTGGGTGA GCAGGGTGCC CX3GATCCAGA GTGTGAAGAC AGAGGCAGAG 3420 

GAGCTGTTTG GGGAGACCAT GGAGATGATG GACAGGATGA AAGACAXGGA GTTGGAGCTG 3480 

CTGOGGGGCA GCCAGGCCAT CATGCTGCGC TCXMCXSOACC TGACAGQACT GGAQAAGCGT 3540 

GTGGAGCAGA TCCGTGACCA CATCAATGGG CGCGTGCTCT ACTRTGCXSiC CT6CAASIGA 3600 

TGCTACAGCT TCCAGCCCGT TGCCCCACTC ATCTGCCGCC TTTGCTTTTO 6TTGGGQGCA 3660 

GATTGGGTTG QAATGCTTTC CATCTCCAGG AGACTTTCAT GCAOCCTAAA STACAGCCTS 3720 

GACCACCCXrr GGTGTGTAGC TAGTAAGATT ACCCTQAGCT GCA6CTGAGC CTGAGCCAAT 3780 

GQGACAGTTA CACTTGACAG ACAAAOATaQ TQQAQATTGa CATG CCATTG AAACTAAGAG 3840 

CTCTCAAGTC AAGGAAGCTG GGCIGGGCAG TATCCCCCGC CTTTAGTTCt CCACTGGGGA 3900 

GGAATCCTQQ ACCAAGCACA AAAACTTAAC AAAAGTGATG TAAAAATGAA AAGCCAAATA 3960 
AAAATCTTTG G 



1 11 21 31 41 51 

| | | | | | 

MRPFPLLCPA LPGLIiHAQQA CSRGACYPPV GDLI.VGHTHP I1RASSTCGI.T KPETYCTQYG 60 

EWQMKCCKCD SRQPHNYYSH RVENVASSSG PMRWHQSQMD VNPVSK}LD1, DRRPQLQEVM 120 

HBFQGPMPAG MLIBRSSDFG KTHRVYQYLA ADCTSTFFRV RQGRFQSHQO VROQSLPQRP 180 

NARUfGGKVQ LNUfDLVSGI PATQSQKIQB VGBITNLRVN PTRI.APVPQR GYHPFSAYYA 240 

VSQUILQGSC FOIGHADRCA PKFGASAGPS TAVQVBDVCV OQEOITAOGMC ERCAPFYiniR 300 

PWRPAEGQDA HECQHCDCUa HSBTCKFOPA VFAASQGAYG GVCDNCROHT EGKNCERCQIt 360 

HYFRNRRPGA SIQETCISCB GDPDQAVPOA PCDPVTGQCV CKEHVQQERC DLCKPGFTGIi 420 

TYANPQGCHR CDCNILGSKR DMPCDEESOR CLCI,PNWGP XCDQCAPYHW KIASGQGCBP 480 

CACDPHNSPQ PTVQPVHRAV PCHEGFGGLM CSAAAIRQCP DRTYGDVATG CRACDCDFRG 540 

TEGPGCDKAS OHCLCRPGLT GPRCDQCQRQ YOJRYPVCVA CHPCFQTYDA DLREQAljRFG 600 

RLRNATASLW SGPGLEDRGL ASRILOAKSK lEQIRAVLSS PAVTEQEVAQ VASAILSLRlt 660 

TLQGWJLDLP LEBBTLSLPR DLESLDRSFN GLLTftYQRKR EQPEKISSAD PSGAPHMLST 720 

AYEQSAQAAQ QVSDSSRIiLD QLRDSRRGAB SLVRQAGGGG GTGSPKLVAL RLEHSSLPOL 780 

TPTFKKLCGN SRQMACTPXS CPOELCFQrai GTAOGSRCSG VLPRAGGA7L MAGQVAEQLR 840 

aPKAQLQRTR QMIRAAEBSA SQIQSSAORL EIQVSASIISQ MEEDVRRTRL LIQQVIUlFIiT 900 

DPDTDAATIQ EVSEAVt>ALM LPTDSATVIiQ KMHBIOAIAA RLPIIVDLVLS QTKQDIARAR 960 

RLQAEAEEAR SBAHAVEGQV EDWQNIiRQQ TVALQEAQDT HQQTSSSUili IQDRVAEVQQ 1020 

VLRPAERLVT SMTKOIiGOFH TRMBRTiWHQA RQQGABAVQA aQLAEGASBQ ALSAQSGFER 1080 

IKQKYAEIiKD SI.GQSSMI1OE gOARIQSVKT BAEEI.FGBIM EMMDRMKDHB LEUiRGSQAI 1140 
KLBSKDVrOL EKRVEQIRDH IKGRVLYYAT CK 

Seq ID NO: 522 DNA sequence 

Nucleic Acid Accession #: NM_001944.1 

Coding sequence: 84..3083 

1 11 21 31 41 51 

| | | | | | 

TTTTCTTAQA CATTAACTGC AGACGGCTGG CAGGATAGAA GCAGCJGGCTC ACTTGGACTT 60 

TTTCACCAGG GAAATCAGAQ ACAATGATGG GGCTCTTCCC CAGAACTACA GGGGCTCTGG 120 

OCATCTTCOT GQTGGTCATA TTGOTTCATG GAGAATTGCG AATAGAGACT AAAGGTCAAT 180 

ATQATGAAGA AOAQATGACT ATGCAACAAG CTAAAAGAAG GCAAAAACGT GAATGGGTGA 240 

AATTTGCCAA ACCCTQCAGR OAAGGAGAAG ATAACTCAAA AAOAAACCCA ATTGCCSAGA 300 

TTACTTCAGA TTACC3«GCA ACCCAGAAAA TCACCTACXS3 AATCTCTGGA GTGGGAATCG 360 

ATCAOCOQCX: TTTTGGAATC TTTGTTGTTG ACAAAAACAC TGGAGATATT AACATAACAG 420 

CTATAGTCGA CCGGGAGGAA ACTCCAAGCT TCCTGATCAC ATGTCGGGCT CTAAATGCCC 480 

AAGGACTAGA TOTAQAGAAA CCACTTATAC TAAOBGTTAA AATTTTGQAT ATTAATGATA 540 

ATCCTCCAGT ATTTTCACAA CAAATTTTCA TGGerSAAAT TBAAGAAAAT ACrtOCCTCAA 600 

ACTCACTGGT GATGATACTA AATGCCACAG ATGCAGATQA ACCAAAOCAC TTaAATTCTA 660 

AAATTGCCTT CAAAATTGTC TCTCAGGAAC CAGCAGGCaC ACCCATGTTC CTCCTAAGCA 720 

GAAACACTGQ GGAAGTCCST ACTTTQACCA ATTCTCTTQA CCXSAGAGCAA GCTAGCAGCT 780 

ATCGTCTGGT TGTGAGTGGT GCAGACAAA6 ATGGAGAAGG ACTATCAACT CAATGTGAAT 840 

GTAATATTAA AGTGAAAGAT GTCAAOGATA ACTtCCCAAT GTTTAGAGAC TCTCAGTATT 900 

CAGCACGTAT TGAAGAAAAT ATTTTAAGTT CTQAATTACT TCGATTTCAA GTAACAGATT 960 

TGGATGAAGA GTACACAGAT AATTGGCTTG CAGTATATTT CTTTACCTCT GGGAATGAAG 1020 

GAAATTGQTT TQAAATACAA ACTGATCCTA GAACTAATGA AGGCATCCTG AAAGTG6X6A 1080 

AGGCTCTAGA TTATGAACAA CTACAAAGCX3 TOAAACTTAO TATTOCTGTC AAAAACAAAG 1140 

CTGAATTTCA CCAATGAGTT ATCTCTCXSAT ACOGASTTCA GTCAACOCCA STCftCAATTC 1200 

AGGTAATAAA TGTAAGAGAA GOAATTCCAT TOCGTCCTGC TTCCAAGACA TTTACTGTGC 1260 

AAAAAGGCAT AAGTAGCAAA AAATTQGTGG ATTATATCCT GGGAACaiTAT CAAGCCATCG 1320 

ATGAGGACAC TAACAAAGCT GCCTCAAATG TCAAATATGT CATGGGACGT AACGATGGTG 1380 

GATACCTAAT GATTGATTCA AAAACTGCTG AAATCAAATT TGTOVAAAAT ATGAACCGAG 1440 

ATTCTACTTT CATAGTTAAC AAAACAATCA CAGCTGAGGT TCTGGCCATA GATGAATAC3V 1500 

OGGGTAAAAC TTCTACAGGC ACX3GTATATG TTAGAGTACC OGATTTCauiT GACAATTGTC 1560 

CAACASCTGT CCICGAAAAA GATQCAOrTr GCAGTTCrTC ACCTTCCGIG GTTGTCTCOG 1620 

CTAOAACACT OAATAATAGA TACACTGGCC CXTTATACATT TGCSkCTOGAA GATCAACCTG 1680 

TAAAGTTOCC TGCCGTATGO AGTATCACAA CCCTCAATGC TACCTCXXXX: CTCCTCSUJAG 1740 

COCAGGAACA OATAOCTCCT GGAOTATACC ACATCTCCCT GGTACTTACA GACBCTCRGA 1800 

ACAATCGGIG TBAGATGCCA CGCSkSCTTGA CACCGOAAGT CTGTCAGTGT GACAACAGGG 1860 
GCATCTGTGG AACTTCTTAC CCAACCACAA C3CCCTGGGAC CAGOTATGGC AGGCCGCACT 1920 

CAGGGAGGCT GGGGCCTGCC GCCATCGGCC TGCTGCTCCT TCGTCTCCTG CTGCTGCTGT 1980 

TGGCCCCCCT TCTGCTGTTQ ACCTGTGACT GTGGGGCAGG TTCTACTGGG GGAGTGACAG 2040 

GTGGTTTTAT CCCAGTTCCT GATGGCTCAG AAGGAACAAT TCATCAGTGG GOAATTGAAG 2100 

GAGCCCATCC TOAAOACAAG GAAATCAC3VA ATATTTCTGT GOCTCCTGTA ACAOCCAATG 2160 

GAGCCGATTT CATGGAAAGT TCTGAAGTTT GTACAAATAC GTATGCCftGA GGCACAGCX3G 2220 

TGGAAGGCAC TTCAGGAAT6 GAAATQACCA CTAAGCTTGG AGCA6CCACT GAATCTGGAG 2280 

GTGCT6CAGO CTTTGCAACA GGGACAGTST CAGGAGCCGC TTCAGGATTC G6AGCA6CCA 2340 

CTGGAGTTGG CATCTGTTCC TCAGGGCM3T CTGQAACCAT GAOAACAAGG CATTCCACTG 2400 

GAGGAACCAA TAAGGACTAC GCTGATGGGG CGATAAGCAT GAATTTTCTG GACTCCTACT 2460 

TTTCTCAGAA AGCATTTGCC TGTGCGGAGG AAGACGATGG CCAGGAAGCA AATGACTGCT 2520 

TGITCATCTA TGATAATGAA GGCGCAGATG CXaVCTGGTTC TCCTGTGGGC TCCGTGGGTT 2580 

GrrGCAGTTT TATTGCTGAT GACICTGGATG ACAGCTTCTT GGACTCACTT GGACCCAAAT 2640 

TTAAAAAACT TGCAGAGATA AjGCCTTGGTG TTGATGGTGA AGGCAAAGAA GTTCAGCCAC 2700 

CCTCTAAAGA CAGOQGTTAT GGGATTGAAT CCTGTGGCCA TCCCATAGAA GTCCAGCAOA 2760 

CAGOATTTGT TAAGTGCCAO ACTrTGTCRG 6AABTCAAGG AGCTTCTQCT TTGTCOGCCT 2820 

CTGGGTCrGT CCAGCCAGCT GTTTCCATCC CTGACCCrCT GCAOCATGGT KA CTAT TTAO 2880 

TAAOGGAGAC TTACTCX3GCT TCTGGTTCCC TCGTGCAACC TTCCACTGCa OOCTTTOATC 2940 

CACTTCTCAC ACRAAATGTO ATAGTQACaG AAAGGGTGAT CTGTCCGATT TCCAGTGTTC 3000 

CTGGCAACCT AGCTGGCCCA ACGCAGCTAC GAGGGTCACA TACTATGCTC TGTACAGAGG 3060 

ATCCTTGCTC CCGTCTAATA TGACCAGAAT GAGCTGGAAT ACCACACTGA CCAAATCTGG 3120 

ATCTTTGGAC TAAAGTATTC AAAATAGCAT AGCAAAGCTC ACTGTATTGG GCTAATAATT 3180 

•IGGCACTTAT TAGCTTCTCT CATAAACTGA TCACX3ATTAT AAATTAAATG TTTGGGTTCA 3240 

TACCCCAAAA GCAATATGTT GTCACTCCTA ATTCTCAAGT ACTATTCAAA TTGTAGTAAA 3300 
TCTTAAAGTT TTTCakAAACC CTAAAATCAT ATTCQC 



MMGLFPRTTG ALAIPVWIL VHGBLRIETK GQYDEEEMTM QQAKRKQKRB WVKPAKPCRB 60 

GEDHSXRKPI AKITSDVQAT QKITYRIS6V GIOQFPFGIF WDKHmSDVH ITAIVDRBET 120 

FSFLITaUU. HAQGLDVEKF IiIIiTVKILDI HDHPFVFSQQ IPMSEISEHS ASNSbVHIIfl 180 

AXDADBFKHIj NSKIAFKIVS QEPAGTPHFIi IiSKHTGEVRT LTSSLDREQA SSYHLVVSGA 240 

DKDGBGIiSXQ CBCHIKVXDV NDNFPHFROS QySABIEENI LSSBLIiRFQV TDIiOEEYTDN 300 

WAVYFFTSG NBGNWFEIQT DPRTMEGILK WKALDYEQI. QSVKLSIAVK HKAEFHQaVl 360 

SRYRVQSTPV TIOVINVHEG lAFRPASKTF TVQRGISSKK LVDYILGTYQ AIDEDTNKAA 420 

SNVKSVMGiar DGGYUMDSK TAEIKFVKHM NRDSTFIVUK TITAEVIiAID EYTGKTSTGT 480 

VYVUVPDFHD NCPTAVIiEKD AVCSS3PSW VSARTLNNKY TGPYTFALED QPVKliPAVWS 540 

ITTIiNATSAL IiRAQBQIPPG VYHISLVLTD SQNHRCBMPS SLTLEVCQCD HHGICGTSYP 600 

TTSPGTRYGR FHSGRLGPAA IGLLLLGLIali LLLAPLIjIjIjT GDCOAGSTGG VTGGPIPVPD 660 

GSBGTIHQHG IBGAHFEOKB ITNICVPPVT AHGAOFMBSS EVCTNTVARG TAVEGTSGME 720 

MmauOAATE SGGAA6FATG TV8GAASGFG AATOVQICSS QQSGTMRTRK STGG'nTKDYA 780 

DGAISMNFLD SYFSQKAFAC AEEDDGQEAN DCUiXVUSBB ADATOSPVOS VGCXISFIADD 840 

MDSPLDSLG PKFKKIAEIS LGWDGEGKEV QP PSKDS GYG IBSOSHPIBV QQTGFVKCQT 900 

LSGSQQASAL SASGSVQPAV SIPDPI.QH(ai YLVTETYSAS QSIiVQPSTAG FDPUiTQNVI 960 
VTKRVICPIS SVPGNIiAGPT QLiRGSHTMLC TEDPCSRLI 

Seq ID NO: 524 DNA sequence 

Nucleic Acid Accession #: XM_058069.2 

Coding sequence: 1..1413 

1 11 21 31 41 51 

ItGRAGTTTC TTCTAATACT GCTCCTGCAG GCCACTGCTT CreOAGCTCT TCCCCTGRAC 60 

AGCTCTACAA GCCTGQAAAA AAATAATGTG CEATTTGGTG AAAGATACTT AOAAAAATTT 120 

TATGGCCTTC AGATAAACAA ACTTCCAGTG ACAAAAATQA AATATAGTGQ AAACTTAATQ 180 

AAGQAAAAAA TCCAAGAAAT GCAGCACTTC TTGGGTCTGA AAGTGACCXK3 GCAACTGGAC 240 

ACATCTACCX: TGGAGATQAT GCACGCACCT CGATGTGGAG TCCXXX3ATGT CCATCATTTC 300 

AGGGAAATQC CAGGGGGGCC CGTATGGAGG AAACATTATA TCACCTACAQ AATCAATAAT 360 

TACACACCTG ACATGAACXM TGAGGATGTT GACTAOGCAA TCCGGAAAGC TTTCCAAGTA 420 

TGGAGTAATG TTACXECCTT GAAATTCSiGC AAGATTAACA CAGGCATGGC TGACATTTTG 480 

GTGGTTTTTG CCCGTGGAGC TCATGGAGAC TTCXaTGCTT TrGATGGCAA AGGTGGAATC 540 

CTAGCCCATG CTTTTGGACC TGGATCTGGC ATTGOAGGGQ ATGCACATTT OGATGAGGAC 600 

GAATTCTGGA CTACACATTC AGGAGGCACA AACTTGTTCC TCACTG CIOT TCROSA6ATT 660 

GGCCATTCCT TAGGTCTTGG CCATTCTAGT QATCCAAAGO CCGTAATGTT CCCCACCTAC 720 

AAATATQTTG ACATCAACAC ATTTCGCCTC TCTGCTGATG ACATACGTGG CATTCAGTCC 780 

CTGTATGGAG ACCCAAAAGA GAACCAACGC TTGCCAAATC CTGACAATTC AGAACCAGCT 840 

CTCTGTGACC CCAATTTGAG TTTTGATGCT GTCACTACCG TGGGAAATAA GATCTTTTTC 900 

TTCAAAGACA GGTTCTTCTa GCTGAAGGTT TCTGAGAGAC CAAAGACCAG TGTTAATTTA 960 

ATTTCTTCCT TATGGCCAAC CTTGCX»TCT GGCATTGAAG CTGCTTATGA AATTQAAGCC 1020 

AGAAATCAAG ' ITriT C T TTT TAAAGATGAC AAATACTGGT TAATTAGCAA TTTAAGACCA 1080 

GRGCCAAATT ATCCCAAGAG CATACATTCT TTTGGTTTTC CTAACrTTGT QAAAAAAATT 1140 

GATOCAGCTG TTTTTAACCC AOGTTTTTAT AGGACCTACT TCTTTOTAOA TAACCASTAT 1200 

TGGAGOTATG ATGAAAGGAG ACXGATGATG GACCJCieOTT ATCCCAAACT GATTACCAAa 1260 

AACTTCCAAG GAATCGGGCX: TAAAATTGAT GCAGTCrTCT ACTCTAAAAA CAAATACTAC 1320 

TATTTCTTCC AAGGATCTAA CCAATTTGAA TATGACITOC TACTCCAAOQ TATCACCAAA 1380 
ACACTGAAAA GCftATAGCTG GTTTGGTTGr TGA 



| | | | | | 

HKFIilLIiLQ ATASOAIJU* SSTSIiEiamV IiEGERYLEKF 1CSI.BniKI.PV -rKMKXSGKUt 
KBKXQEKQHP LGLKVTGQIiD TSTLEHMHRP RC3GVPDVHHF RBMPGGPVWH KHyiTYKINM 
YTPDMNREDV DYAIRKAFQV WSNVTPLKFS KINTGMADIL WFARGftHGD FHAFDGKGGI 
UffiAFGFGSG IGQ»BFOED ETNTTHSGGT NLFItTAVHEI ^SLGUaiSS DFKAVHFPTY 
KZVDINTFRL SADDISQIQS LYGDPKENQR LFNPDHSEPA LCDENLSFDA VTTVGIIKIFF 
FKDRFFMUW SERPKTSVNIi ISSLWPTLPS GIEAAYErEA RNQVFLFKDD lOWLISNIiRF 
EPHYPKSIHS FGFPNFVKKI DAAVFNPRFlf RTYFFVDHQY HRYDERRQtm DPGYPKLITK 
NFQGIGPKID AVFYSKNKYY YFFQGSNQFE YDFLLQRITK TLKSNSWFGC 

Seq ID NO: 526 DNA sequence 

Nucleic Acid Accession #: NM_024423.1 

Coding sequence: 64..2590 



1 11 21 31 41 51 

| | | | | | 

GGCAGGTCTC GCTCTCXSGCA CCCTCCOGGC CX:CCXKX3TTC TOCTGGCCCT GCCC3GGCATC 60 

CCGATGGCCX3 CCGCTGOGCC CCGGCX3CTCC GTGCGCGGAG CCGTCTGCCT GCATCTGCTG 120 

CTGACCCTCG TGATCTTCAG TCGTGATGGT GAAGCCTGCA AAAAGGTGAT ACTTAATGTA 180 

CCTTCTAAAC TAGAGGCAOA CAAAATAATT GGCAGAGTTA ATTTGGAAGA GTGCTTCAGG 240 

TCTQCAGACX: TCATCCGGTC AAGTGATCCT GATTTCAfiAG TTCTAAATGA TGGGTCAGTG 300 

TACACAGCCA GOaCTGTTGC GCTQTCX«3AT AAQAAAA6AT CATTTACXaX ATGGCriTCT 360 

GACAAAAGGA AACAGAC»CA GAAAGAGOTT ACTGTGCIGC TAOAACATCA 6AAQAAG6TA 420 

TOGAAOACAA OACACACTAG AGAAACXGrT CTCMGGCGTG CCAAGAGQAG ATGGGCACCT 480 

ATTCCTTOCT CTATGCAAQA OAATTCCTTG GGCCCTTTCC CATTGTTTCT TCAACAAGTT 540 

GAATCTGATG CAGCACAGAA CTATACTGTC TTCTACTCAA TAAGTGGACG TGGAGTTGAT 600 

AAAQAACCTT TAAATTTGTT TTATATAGAA AGAGAC3VCTG GAAATCTATT TTGCACTCGG 660 

CCTGTGGATC GTGAAGAATA TGAT UT l' lTT GATTTGATTa CTTATGCGTC AACTGCAGAT 720 

GGATATTCAG CAGATCTGCC CCTCCCACTA CCCATCAGGG TAGAGGATGA AAATGAOWIC 780 

CACCCTGTTT TCACAGAAGC AAXTTATAAT TTTGAAGTTT TGGAAAGTAG TAGACCTGGT 840 

ACTACAGT6G GOaUGaPrTa TGCXaCAGAC AGAGATGAAC CGGACACRAT GCATACGOSC 900 

CTSAAATACA OCATTTTQCA GC3USACACC3V AGGTCACCTG GGCTCTTTTC TGTGCATCCC 960 

AGCnCAGGOS TAATCACCAC AGTCTCTCAT TATTTGGACA GAGAGGTTGT AGACAAGTAC 1020 

TCATTGATAA TGAAAGTACA AGACATGGAT GGCCAGTTTT TTGGATTGAT AGGCaVCATCA 1080 

ACTTGTATCA TAACAGTAAC AGATTCAAAT GATAATGCAC CCACTTTCAG ACAAAATGCT 1140 

TATGAAGCAT TTGTAGAGGA AAATGCATTC AATGTGGAAA TCTTAC GAAT ACCTATAGAA 1200 

GATAAGGATT TAATTAACAC TGCCAATTGG AGAGTCAATT TTACCaTTTT AAAGGOAAAT 1260 

GAAAATGGAC ATTTCAAAAT CAGCACAGAC AAAGAAACTA ATGAAGGT6T TCTTTCTSTT 1320 

GTAAAGCCAC TGAATTATGA AGAAAACCGT CAAGTGAACC TGQAAATTGG AGTAAACAAT 1380 

GAAGCGCCAT TTGCTAGAGA TATTCCCAGA GTGACAGCCT TOAACASAGC CTTGGTTACA 1440 

GTTCS^TGTGA GGGATCTGGA TGAGGGGCCT GAATGCACTC CTGCAGCCCA ATATGTGOSG 1500 

ATTAAAGAAA ACTTAGCAGT GGGGTCAAAQ ATCAAOGGCT ATAAGGCATA TGACCCCGAA 1560 

AATAGAAATG GCAATGGTTT AAGGTACakAA AAArTGCATG ATCXTTAAAGG TTGGATCACC 1620 

ATTGATGAAA TTTC3M3GGTC AATCATAACT TOCAAAATCC TGGATAGGGA GGTTGAAACT 1680 

CCCAAAAATG AGTTGTATAA TATTACAGTC CTQGCAATAQ ACAAAGATQA TAGATCATGT 1740 

ACTGGAACAC TTOdGISAA CATTGAAQAT GTAAATGATA ATOCACCAGA AATACTTCAA 1800 

GAATATGTAG TCATTTGCAA ACCAAAAATQ GGGTATACCQ ACATTTTAGC TGTTGATCCT 1860 

GATGAACCTG TCCATGQAGC TCCATTTTAT TTCAGTTTGC caUVTACTTC TCCAGAAATC 1920 

AGTAGACTGT GGAGCCTCAC CAAAGTTAAT QATACAGCTG CCCGTCTTTC ATATCAGAAA 1980 

AATGCTQGAT TTCAACaATA TACCATTCCP ATTACTGTAA AAGACAGGGC CGGCCAAGCT 2040 

GCRACAAAAT TATTGAOAOT TAATCTGTGT GAATGTACTC ATCXaVACTCA GTGTCGTGOS 2100 

ACTTCAAGGA GTACAG6AGT AATACTTGGA AAATGGGCAA TCCTTGCAAT ATTACTGGGT 2160 

ATAGCACIGC TCTTTTCTGT ATTGCTAACT TTAGTATGTG GAGTTTTTGG TGCAACTAAA 2220 

GGGAAACGTT TTCCTGAAGA TTTAGCRCAG CAAAACTTAA TTATATCAAA CACAQAAGCA 2280 

CCTGGAGAOB ATAGAGTGTG CTCTQCCAAT GGATTTAIGA COCAAACTAC CAACAACTCT 2340 

AGCCAAGGTT TTTGTGQTAC TATGGGATCA GGAATGAAAA ATGGAGGGCA GGAAACCATT 2400 

GAAATGATGA AAGGAGGAAA CCAGACCTTG GAATCCTGCC GGGGGGCTGG GCATCATCAT 2460 

ACCCTGQACT CCTGCAGGGG AGGACACACXS GAGGTGGACA ACTGCAGATA CACTTACTCG 2520 

GAGTGGCACA GTTTTACTCA ACCCCGTCTC GGTGAAGAAT CCATTAGAGQ ACACACTGGT 2580 

TAAAAATTAA ACATAAAAGA AATTGCATCQ ATGTAATCAG AATGAAGACC GCATGCCATC 2640 

CCAAGATTAT GTCCTCACTT ATAACTATGA GGGAAGAGGA TCTCCAGCTG GTTCTGTGGG 2700 

CTGCTGCAGT GAAAABCAGG AAGAAGATGG CCTTGACTTT TTAAATAATT TGGAACCCAA 2760 

ATTTATTAC^ TTAGCAGAAG CATGOUaAA GAGATAATGT CACAGTGCTA CAATTAGGTC 2820 

TITOrCAGAC ATTCTGGAGG TTTCCAAAAA TAATATTGTA AAGTTCAATT TCAACaiGTA 2880 

TOTATATGAT GATTTTTTTC TCAATTTTGA ATTATGCTAC TCACCAATTT ATATTTTTAA 2940 

AQCCAGTTGT TGCTTATCTT TTCCAAAAAG TGAAAAATGT TAAAACAGAC AACTGGTA AA 3000 

TCTC3UVACTC CAGCACTGGA ATTAAGGTCT CTAAAGCATC TGCTCTTTTT TTTTTTTACG 3060 

GATATTTTAG TAATAAATAT GCTGGATAAA TATTAGTCCA ACAATAGCTA AOTTATGCTA 3120 

ATATCACATT ATTATGTATT CACTTTAAGT GATAGTTTAA AAAATAAACA AGAAATATTC 3180 

AGTATCACTA TGTGAAGAAA GTTTTGGAAA AGAAACAATG AAGACTGAAT TAAATTAAAA 3240 

ATGTTGCAGC TCATAAAGAA TTGGGACTCA CCCCTACTGC ACTACCAAAT TCATTTGACT 3300 

TTGOAGGCAA AATGTGTTGA AGTGOCCTAT GAAGTAGCAA TTTTCTATAG GAATATAGTT 3360 

QGAAATAAAT GTOTGlXiTGT ATATTATTAT TAATCAATGC AATATTTAAA ATGAAATGAG 3420 

AACAAAOAOG AAAATGOTAA AAACTTGAAA TGAGGCTGGG GTATAGTTTG TCCTACAATA 3480 

GAAAAAAGAG AGAOCTTCCT AGGCCTGGGC TCTTAAATGC TGCATTATAA CTGAGTCTAT 3540 

GAGGAAATAO TTCCTGTCCA ATTTGTGTAA TTTGTTTAAA ATTGTA AATA AATTAAACTT 3600 

TTCTGGTTTC TGTGGGAAGG AAATAGGGAA TCCBATGGAA C3W3TAGCTTT GCTTTCCa^GT 3660 

CTGTTTCAAG ATTTCTGCAT CCACAAGTTA GTAGCAAACT GGGGA ATACT CGCTGCAGCT 3720 

GGGGTTCCCT GCTTTTTGGr AGCAAGGGTC CAGAGATOAG GT3TTTTTTT aSGGGBGCTA 3780 

ATAACAAAAA CATTTTAAAA CTTACCTTTA CTGAAGTTAA ATCCTCTATT GCTGTTTCTA 3840 

TTCTCTCTTA TAGTGACCAA CATCTTTTTA ATTTAGATCC AAATAACCAT GTCCTCCTAG 3900 

AGTTTAGAGG CTAGAGGGAG CTGAGGGGAG GATCTTACTG AAAGCACCCT GGGGAGATTG 3960 

ATTCTCXTTTA AACCTAAGCC CCACAAACTT GACACCTGAT CAGGTCTGGG AGCTACAAAA 4020 

TTTCaTTTTT CTCCTCACTG CCCTTCTTCT GAGTGGCATT GGCCTQAATC AAGGAAAGCC 4080 

AGQCCTTGTG GGCCCCCTTC TTTCGGCTTT CTGCTAAAGC AACACCTCCA GCAGAGATTC 4140 

CXrrrAAGTGA CTCCAGGTTT TCCACC3VTCC TTCAGCX3TGA ATTAATTTTT AATCaGTTTa 4200 

CTTTCTCCAG AGAAATTTTA AAATAATASA AGAAATAQAA ATTTTQAATG TATAAAAGAA 4260 

AAAGATCAAG TTGTCATTTT AGAACAQAGG GAACTTT0G6 AGAAAOCAGC CCAAGTAG6T 4320 

TATTTGTACA GTOUSAGGGC AACAGGAAGA TGCAGGCCTT CAA BGGCAAG GAOAQGCCAC 4380 

AAGGAATATG GGTGGGAGTA AAAGCAACAT CGTCTGCrTC ATACTTTETC CTAGQCTTGG 4440 
CACTGCCTTT TCCTTTCICJV GGCCftATGGC AACTGCCATT TGAGTOCGGT GAGGGATCAG 4500 

OCAACCTCTT CTCTATGGCT CftCCTTATTT GOASTGASAA ATCAAG6AGA CAGAGCKSAC 4560 

TQCATGATCA 6TCTGAAGGC ATTTQCASGA TOAOCXrrGAA CTGGTTGTGC AGAACAAACA 4620 

AGGCATTCAT GGGAATTGTT GTATTCCTTC TGCAGCCCTC CTTCTQGGCA CTAAOAAGaT 4680 

CTATGAATTA AATGCCTATC TAAAATTCTG ATTTATTCCT ACRTTTTCTG TTTTCTAATT 4740 

TGACCCTAAA ATCTATGTGT TTTAGACTTA GACTTTTTAT TGCCCCCCCC CCCTTTTTTT 4800 

TTGAGACGGA GTCTCGCTCT GACGCACAGG CTGGAGTGCA GTGGCTCCGA TCTCTGCTCA 4860 

CTGAAAGCTC CGCCTCCCGG GTTCATGCCA TTCTCCTGCC TCAGCCTCCT GAGTAGCTGG 4920 

GACTACAGGC GCCCACCACX: ACGCCCGGCT AATTTTTTGT ATTTTTAATA QAGAOSGGGT 4980 

TTCACXGTXST TAGCCAGQAT GGTCTCGATC TCCTCACCTC GTGATCCGCC TGCCTCGGCC 5040 

TCCCAAAGTG CTOGGATTAC AGGCATGACC CACCQCTCCC GGCCTTOTTT TCCGTTTAAA 5100 

GTOGTCTTCT TTTAATaXAA TCATTTTGAA CRTGTGTGAA ASTTOATCAT ACXSAATTGGA 5160 

TCRATCTTGA AATACTCAAC CAAAAGACAG TCOAGAAGCC AGGGGGAGAA AOAACTCAGG 5220 

GCACSUkAATA TTGGTCTGAG AATGOAATTC TCTGTAAGCX: TAGTTGCTGA AATTTCCTGC 5280 

TGTAACCAGA AGCCAGTTTT ATCTAACGGC TACTGAAACA CCCACTGTGT TTTGCTCACT 5340 

CCCACTCACC GATCAAAACC TGCTACCTCC CCAAGACTTT ACTAGTGCCG ATAAACTTTC 5400 

TCAAAGAGCA ACCAGTATCA CTTCCCTGTT TATAAAACCT CTAACCATCT CTTTG TTCTT 5460 

TGAACATGCT GAAAACXavCC TGGTCTGCAT GTATGCCCGA ATTTGTAATT CTTTTCTCTC 5520 

AAATQAAAAT TTAATTTTAa GGATTCATTT CTATATTTTC ACATATGTAG TAT TATTA TT 5580 

TCCTTATATO 1GTAAGGTGA AATTTATGGT ATTTGAQTQT GCMGAAAAT ATATTTTTAA 5640 

MCTTTCRTT TTTCCCCCAQ TGAATCATTT AGAATITTTT ATGTAAATAT ACAGAATGTT 5700 

TTTTCTTACT TTTATAAGGA AGCAGCTQTC TAAAATOCAG TaOGGTTTQT TTTGCAArGT 5760 

TTTAAACAGA GTTTTAGTAT TGCTATTAAA AGAAGTTACT TTGCTTTTAA AGAAACTTGG 5820 

CTGCTTAAAA TAAGCAAAAA TTGGATGCAT AAAGTAATAT TTACAGATGT GGGGAGATGT 5880 

AATAAAACAA TATTAACTTG GCTGCTTAAA ATAAGCAAAA ATTGGATGCA TAAAGTAATA 5940 

TTTACAGATG TGGGGAGATG TAATAAAACA ATATTAACTT GGTTTCTTGT TTTTGCTGTA 6000 

TTTAQAaATT.AAATAATTCT AAGATGATCA CTTTGCAAAA TTATGCTTAT GGCTGGCATG 6060 

GAAATAGAAA TACTCAATTA TGTCTTTGTT GTATTAATGO GGAATATTTT GGACAATGTT 6120 

TCATTATCAA ATTOTCQACA ICATTAATAT ATATTGTAAT GTTGGGAAGA GATCaCTATT 6180 

rrOAAOCACA GCTTTACAGA TOAQTATCTA TOATACATAT GTATAATAAA TTTTGATCX3G 6240 

GTATTAAAAG TATTAGAAGG TGGTTATAAT TGCAGAQTAT TCCATGAATA GTACACTGAC 6300 

ACAQGGGTTT TACTTTGAGQ ACCAGTGTAG TCAAGGGAAA ACATGAGTTA AAAAGAAAAO 6360 

CAGGCAATAT TGCaGTCTTG ATTCTGCCaC TTACAGGATA GATAATGCCT QAACTTTAAT 6420 

GACAAGATQA TCCAACCATA AAGGTGCTCT GTGCTTCACA GTOAATCTTT TCCCCATGCA 6480 

GGAGTGTGCT CCCCTACAAA CGTTAAGACT GATCATTTCA AAAATCTATT AGCTATATCA 6540 

AAAGCCTTAC ATTTTAATAT AGGTTGAACC AAAATTTCAA TTCKAGTAAC TTCTATTGTA 6600 

ACCATTATTT TTGTGTATGT CTTCARGAAT GTTCATTGGA TTTTTGTTTG TAATAGTAAA 6660 

ATACOGOATA CATTTCACGT GTCCTTCAGT ATTUATTTGG TTGAATATTG GGTCATAATG 6720 

GTTOAOAAGC ATGGACACTA GAGCCAGAAT GCTTGGATAT QAATCCTGGA TCIGTCACTT 6780 

ACTTCTGTGT GACCTTTGAA AGGCTACTTA TTTCXTTCTCT TAGCTTTCTC ATTAAAATCA 6840 

ATGAACAATG CCAGCCTCAT GGGGTTGTTG AATGATTAAA TTAGTTAATA TACCTAAAGT 6900 

ACATAOAACA CTGCCTCCAC ATAGTAAAAG AATTATAAGT GTGAGQTAGT TGG TAAA AIT 6960 

ATGTAaTTOG ATATACTACX: GAACAATATC TAATCTCTTT TTAGGGAAAT AAAOirtGTG 7020 
CATATATATA ATCCCGRAAC ATG 

Seq ID NO: 527 Protein sequence 
Protein Accession #: NP_077741.1 

1 11 21 31 41 51 

KAAAGFRRSV RaAVCI<KI.I.I< TLVIFSROGE ACKKVIUNVP SKLEAOKIIG RVHLEECFRS 60 

ADLIRSSDPD FRVI.NDaSVY TAHAVALSDK XRSFnHI.SO KRXQTQXBVT VU.&HQKKVa 120 

KTSHTRETVI. RKAKRRWAPI PCSMQEHSLG PFPLFIiQaVE SDAAQNYTVF YSIS6R6VDK 180 

EPtiNLFyiER DTCNUCTRP VDREEVDVPB LIAYASTADG YSADIiPLFIiP IRVEDENDNH 240 

PVFTEAIYNP EVLESSHPGT TVGWCATDR DEPDTMHTRL KYSILQQTPR SPGLFSVHPS 300 

TCVITTVSHY LDSEWDKYS tlMKVQDMDG QFFGLIGTST CIITVTDSND NAPTFRQNAY 360 

EAFVEEKAFH VEILRIPIGD KDLIKTANVfR VNFTIIiKGKE NGHFKISTDK BTNEGVLSW 420 

KPLNYEENSO VHLEIGWOIB APFAHDIPRV TALNRALVTV HVHDU)EGPB CTPAAQYVHI 480 

KENLAVGSKl NGYKAYOFEN RNGNGLRYKK LHDPKGWITI DEISGSIITS KILDREVETP 540 

XHEIiYNITVI. AXOKIX3RSCT GTLAVHIEDV NDNPPEILQB YWICKPKMG YTDIIAVDPD 600 

EPVEOAPFYF SUHTSPBIS RLNSLTKyND TAARLSXQIOI AOFQEYTIPI TVKDSAGQAA 660 

TKUiRVNIiCE CTHPTQCBAT SRSTGVILQK WAIIAILLGI ALIiPSVUjTL VOBVFGATXG 720 

KRFPHJIAQQ NLIISNTBAP OJDRVCSAIK} FMTQTTNNS8 QGFCOTMGSO MKNGGQBTIB 780 
MMKGGNQTLE SCRGAGHHRT LDSCSGGHTE VDNCRYTYSE WHSFTQFRL6 EESIRGHTQ 

Seq ID NO: 528 DNA sequence

Nucleic Acid Accession #: NM_001941.2

Coding sequence: 64..2754

1 11 21 31 41 51

I I I I I I

GGCAGGTCTC GCTCTOQGCA OCCTCCX33GC GCCCGOSTTC TCCTGGCCCT GCCXX3GCATC 60 

CCGATGGCCG CCGCTGGGCC CCGGCGCTCC GTGCGGGGAG CCGTCTGCCT GCATCTGCTG 120 

CTGACCCTCG TGATCTTCAG TCGTGATGGT GAAGCCTGCA AAAAGGTGAT ACTTAATGTA 180 

CCTTCTAAAC TAGAGGCAGA CAAAATAATT GGCAGAGTTA ATTTGGAAGA GTGCTTCAGQ 240 

TCTGCAGACC TCATCOGGTC AAGTGATCCT GATTTCAGAG TTCTAAATGA TGGGTCAGTG 300 

TACACAOCCA GGGCTGTTGC GCTGTCTGAT AAGRAAAGAT CATTTACCAT ATGGCTTTCT 360 

GACAAAAGOA AACAGACACA GAAAGAGGTT ACTGTGCTGC TAGAACATCA GAAGAAQGTA 420 

TOSAAGACAA GACACACTnO AQAAACIGTT CTCAQGOGTG CC3UVGAGGAG ATGGGCACCT 480 

ATTCCTTGCT CTATQCAAGA GAATTCCTTG GGCCCTTTCC CATTGTTTCT TCAACAAGTT 540 

GAATCTGATG CAGCACAGAA CTATACTGTC TTCTACTCAA TAftGTGGACG TGGAGTTGAT 600 

AAAGAACCTT TAAATTTGrr TTATATAGAA AGAGACACTQ GAAATCTATT TTGCACTCGG 660 

CCTGTGGATC GTGAAGAATA TGATGTrTTT GATTTGATTG CTTATGCGTC AACTGCAGAT 720 

GGATATTCAG CAGATCTGCC CXrXCCCACTA CCCATCAGGG TAGAGGATGA AAATGACAAC 780 

CACCCTGTTT TCACAGAAGC AATTTATAAT TTTGAAGTTT TGGAAAGTAQ TAGACCTGGT 840 

ACTACAGTGG GGGTGGTTTG TGCCACAGAC AGAGATGAAC CGQACACAAT GCATACQCGC 900 

CICAAATACA GCATTTTGCA GCftGACACCa AGGTCACCTG GGCTCTTTTC TGTGCATCCC 960 
AGCACAGGCG TAATCACCAC
TCATTCA.TAA TGAAAGTACA 
ACTTGTATCA TAACAGTAAC 
TATGAAGCAT TT6TAGAGGA 
GATAAGGATT TAATTAACAC 
OAAAATGQAC ATTTCAAAAT 
GTAAAGCCAC TGAATTATGA 
GAAGCGCCAT TTGCTAGAGA 
GTTCATOTGA GGOATCTGOA 
ATTAAAGAAA ACTTAGCAGT 
AATAGAAATG GCAATGGTTT 
ATTQATOAAA TTTCAOaGTC 
CCC3WAAATG AGTTGTATAA 
ACTGGAACAC TTGCTGTGAA 
GAATATGTAG TCATTTGCAA 
GATGAACCTG TCCATGGAGC 
AGTAGACTGT GGAGCCTCAC 
AATGCTGGAT TTCAAGAATA 
GCAACAAAAT TATTGAQAQT 
ACTTCAAGQA GIACAGGAQT 
ATA6CACTGC TCTTTTCTGT 
GGGRAAOGTT TTCCTGAAGA 
CCTGGAaACQ ATAGAGTGTG 
AGCCAAGGTT TTTGTGGTAC 
GAAATGATGA AAGGAGGAAA 
ACCCTGGACT CCTQCAGGGQ 
GAGTGGCACA GTTTTACTCA 
GAAGACCGCA TGCCATCCCA 
CCA6CT88TT CTGTGGGCTG 
AAXAATTTOG AACCCAAATT 
AOTGCTACAA TTAGGTCTTT 
TTCAATTTCA ACATGTATQT 
CCAATTTATA TTTTTAAAGC 
AACAGACAAC TGGTAAATCT 
TCTTTTTTTT TTTTACGGAT 
ATAGCTAAGT TATGCTAATA 
ATAAACAAGA AATATTGAGT 
ACTGAATTAA ATTAAAAATG 
ACCAAATTCA TTTGACTTTG 
TCTATAGGAA TATAGTTGGA 
ATTTAAAATG AAATGAGAAC 
TASIXTGICC TACAATAGAA 
ATTATAACTG AGTCTATGAG 
GTAAATAAAT TAAACTTTTC 
TAGCTTTGCT TTGCAGTCTG 
OAATACTCGC TGCAGCTGGG 
TTTTTTTCGG GGAGCTJUVTA 
CTCTATTGCT GTTTCTATTC 
TAACCATGTC CTCCTAGAOT 
GCaCCCTGGG GAGATTGATT 
GTCrGGQAGC TACAAAATTT 
CTGAATCAAG SAAAGCCAGG 
ACCTCCAGCR GA OATTC CCT 
AATTTTTAAT CAGTTTGCTT 
TTGAATGTAT AAAAGAAAAA 
AAGCAGCCCA AGTAGGTTAT 
GGGCAAGGAG AGGCCACAAG 
CTTTTTCCTA CGCTTGGCAC 
GTCOGGTGAG GGATCAGCCA 
3 AGCTGACraC 






AAATGCATTC 
TGCCAATTGG 
CAGCACAGAC 
AGAAAACCGT 
TATTCXX»GA 
TGAGGGQCCT 



AAGGTACAAA 
AATCATAACr 
TATTACAGTC 
CATTGAAGAT 
ACCAAAAATG 
TCCATTTTAT 
CAAAGTTAAT 
TACCATTCCT 
TAATCTGTGT 
AATACTTGGA 
ATTGCTAACT 
TTTAGCACAG 
CTCTGCCAAT 
TATGGOATCA 
CCAGACCTTG 
AGGACACACO 
ACCCOOTCTC 
AGATTATGTC 
CTGCAGTGAA 
TATTACATTA 
GTCAGACATT 
ATATGATGAT 
CAGTTGTTQC 
CAAACTCCAQ 
ATTTTAGTAA 
TCAGATTATT 
ATCACTATGT 
TTGCAGCTCA 



GAAATAGTTC 

TTTCAAGATT 
QTTCCCTGCT 
ACAAAAACAT 
TCTCTTATAG 
TTAGAGGCTA 
GTCCTTAAAC 

carrrrrcTC 



gctccx3atct 
gcctcctgag 
tttaatagag 
atcx:gcctgc 
CTTomroc 

TQAn».TAOG 
GGGAGAAAGA 
TTGCTGAAAT 
ACTGTGTTTT 
CTAGTGCCGA 
TAACCATCTC 
TTTGTAATTC 
CATATGTAGT 
CAAGAAAATA 
T6TAAATATA 
GGGGTTTGTT 
TQCTTTTAAA 
TACAGATGTG
TAGAGATTAA 
AATASAAATA 
ATTATCAAAT 
GAAGCAC3U3C 
ATTAAAAGTA 



CTGCTCACTG 
TAGCTGGGAC 
AOSGGGTTTC 
CTCGGCCTCC 
GTTTAAAGTC 
AATTGGATCA 
ACTCAGGGCA 
TTCCTGCTGT 
GCTC ACTC CC 
TAAACTTTCT 
TTTGTTCTTT 



OATCAAGTTO 
TTGTACAGTC 
GAATATGGCT 
TGCCTTTTCC 
ACCTCTTCTC 
ATGATGABTC 

TOAATTAAAT 
CCCTAAAATC 
AGACGGA6TC 
AAAGCTCCGC 
TACAGGOSCC 
ACTGTQTTAG 
CAAAGTGCTG 



ATCTTQAAAT 



ATTATTATTT 
TATTTTTAAA 
CAGAATGTTT 
TTGCAATGTT 
GAAACITGGC 
GGGAGATGTA 
ATAATTCTAA 
CTCAATTATO 
TGTCGACATC 
TTTACAGATG 
TTAGAAGGTC 



TCACTCACOG. 
CAAAGAGCAA 
GAACATGCTG 
AATGAAAATT 
CCTTATATGT 
GCTTTCATTT 
TTTCTTACTT 
TTAAACaGAG 
TQCTTAAAAT 
ATAAAACAAT 
GATGATCACT 
TCTTTGTTOT 
ATTAATATAT 
AGTATCTATG 
GTTATAATTG 



OATAATGCSVC CCACTTTCAG ACAAAATGCT 
AATOTGQAAA TCTTACQAAT ACCTATAGAA 
AGAQTCAATT TTACCATTTT AAAGGGAAAT 
AAAGAAACTA ATGAAGGTGT TCTTTCTGTT 
CAAGTGAACC TGGAAATTGG AGTAAACAAT 
GTGACAGCCT TGAACAGAGC CTTGGTTACA 
GAATGCACTC CTGCAGCCCA ATATGTGCGG 
ATCAAOGGCT ATAAGGCATA TGACCCCGAA 
AAATTGCATG ATCCTAAAGG TTGGATCACC 
TCCAAAATCC TGGATAGGGA GGTTGAAACT 
CTGGCAATAG ACAAAGATGA TAGATCATGT 
GTAAATGATA ATCCACCAGA AATACTTCAA 
GGQTATACCG ACATTTTAGC TGTTGATCCT 
TTCAGTTTGC CCAATACTTC TCCAGAAATC 
GATACAGCTG CCCGTCTTTC ATATCAGAAA 
ATTACTGTAA AAGACAGGGC CGGCCAAGCT 
GAATGTACTC ATCCAACTCA GTGTCGTGCG 
AAATGGGCAA TCCTTGCAAT ATTACTGGGT 
TTAGTATGTG GAGTTTTTGQ TGCAACTAAA 
CAAAACTTAA TTATATCAAA CACAGAAGCA 
GGATTTATGA CCCAAACTAC CAACAACTCT 
GGAATOAAAA ATGQAGGGCA GGAAACCATT 
GAATCCTGCC GGGGGGCTGG GCATCATCAT 
GAGGTGGACA ACTGCAGATA CACTTACTCO 
GGTGAAAAAT .TGCATCGATG TAATCAGAAT 
CTCACTTATA ACTATGAGGG AAGAGGATCT 
AA6CAGGAAQ AAGATGGCCT TGACTTTTTA 
GCAGAAGCAT QCACAAAGAG ATAATGTCAC 
CTGGAGGTTT CCAAAAATAA TATTGTAAAG 
TTTTTTCTCA ATTTTGAATT ATGCTACTCA 
TTATCTTTTC CAAAAAGTOA AAAATOTTAA 
CACTGGAATT AAGQTCTCTA AAGCATCTQC 
TAAAIAT6CT GGATAAATAT TAGTCCAACA 
ATGTATTCAC TTTAAGrGAT AGTTTAAAAA 
6AAQAAAGTT TTGGAAAAGA AACAATGAAG 
TAAAOAATTQ GGACTCACCC CTACTGCACT 
GTGTTGAAGT GCCCTATGAA GTMCAATTT 
TGTGTGTATA TTATTATTAA TCAATGCAAT 
ATGGTAAAAA CTTGAAATGA GGCTGGGGTA 
GCTTCCTAGG CCTGG6CTCT TAAATGCTGC 
CTGTCCAATT TGTGTAATTT OTTTAAAATT 
GGGAAGQAAA TASOOAATCC AATGGAACAG 
TCTGCATCCA CAAGTTAGTA GCAAACTGGG 
TTTTGGTAGC AAGGGTCCAG A6ATGAGGTG 
TTTAAAACTT ACCTTTACTG AAGTTAAATC 
TGACCAACAT CTTTTTAATT TAGATCCAAA 
GAGGGAGCTG AGGGGAGGAT CTTACTGAAA 
CTAAGCXrCCA CAAACTTGAC ACCTGATCAG 
CTCACTGCCC TTCTTCTGAG TCGCATTG6C 
OCCCTTCTTT OOGCTTTCTG CTAAAGCAAC 
CAGGTTTTCC ACCATCCTTC AQCOTGAATT 
AATTTTAAAA TAATAGAAGA AATAGAAATT 
TCaTTTTAGA ACAGAGQGAA CTTTGGGAGA 
AGAGGGCAAC AGGAAGATGC AGGCCTTCAA 
GGGAGTAAAA GCAACATCGT CTGCTTCATA 
TTTCTCAGGC CAATGGCAAC TGCCATTTGA 
TATGGCTCAC CTTATTTGGA QTGAQAAATC 
TGAAGGCATT TGCAGGATGA GCCTGAACTG 
AATTGTTeTA TTCCTTCTGC AGCCCTCCTT 
GCCTATCTAA AATTCTGATT TATTCCTACA 
TATGTGTTTT AGACTTAGAC TTTTTATTGC 
TOOCTCTGAC GCACaWSGCTG GAGTGCAGTG 
CTCCCGGGTT CATGCCATTC TCCTGCCTCA 
CACX»CCACG CCCGGCTAAT TTTnGTATT 
CCAGGATGGT CTCGATCTCC TGACCTOGTG 
GGATTACAGG CATGACCCAC CGCTCCCGQC 
AATGTAATCA TTTTGAACAT GTGTGAAAGT 
ACTCAACCAA AAGACAGTCXS AGAAGCCAGG 
GTCTGAGAAT GGAATTCTCT GTAAGCCTAG 
CAGTTTTATC TAACS3GCTAC TGAAACACCC 
ATCAAAACCT GCTACCTCXIC CAAGACTTTA 
CCASTATCAC TTCCdGTCX ATAAAAOCIC 
AAAACCACCT GGTCTGCA1C! TAT6CCC8AA 
TAATTTTAGO OATTCaTTTC TATATTTTCA 
GTAAGGTGAA ATTTATG6TA TTTGAGTaTG 
TTCCCCCAGT GAATGATTTA GAATTTTTTA 
TTATAAGGAA GCAGCTGTCT AAAATGCAGT 
TTTTAGTATT GCTATTAAAA GAAGTTACTT 
AAGCAAAAAT TGGATGCATA AAGTAATATT 
ATTAACTTGG TTTCTTGTTT TTGCTGTATT 
TTGCAAAATT ATGCTTATGG CTGGCATCGA 
ATTAATGGQO AATATTTTGQ ACAAT GTTTC 
ATTGTAATQT TGGGAAGAGA TCACTATTTT 
ATACATATGT AXAATAAATT TTGATCGOCST 
CAGAGIATTC CATOAATAQT ACACTGACAC 
AGGGGTTTTA CTTTCAGGAC CAGTGTAGTC AAGGGAAAAC ATGAGTTAAA AAGAAAAGCA 6240

GGCAATATTG CAGTCTTGAT TCTGCCACTT ACAGGATAGA TAATGCCTGA ACTTTAATGA 6300 

CAAGATGATC CAACCATAAA GGTGCTCTGT GCTTCACAGT GAATCTTTTC CCCATGCAGG 6360 

AGTGTGCTCC CCTACAAACXj TTAAGACTGA TCATTTCAAA AATCTATTAG CTATATCAAA 6420 

AGCCTTACAT TTTAATATAG GTTGAACCAA AATTTCAATT CCAGTAACTT CTATTGTAAC 6480 

CATTATTTTT GTOTATGTCT TCAAGAATQT TCATTOGATT TTTGTTTGTA ATAGTAAAAT 6540

ACOGGATACA TTTCaVOGTCT CCTTCJVQTAT TGATTTGGTT GAATATTGOa TCATAATGGT 6600 

TGAGAAGCRT GGACACTAGA GCCAGAATGC TTGGATATGA ATCCTGQATC TGTCACTTAC 6660 

TTCTGTGTGA CCTTTGAAAG OCTACTTATT TCCTCTCTTA GCTTTCTCAT TAAAATCAAT 6720 

GAACAATGCC AflCCTCATGG GGTTGTTGAA TGATTAAATT AGTTAATATA CCTAAAGTAC 6780 

ATAOAACACT GCCIGCACAT AGTAAAA6AA TTATAAQTGT GAGGTA6TTG GTAAAATTAT 6840 

GTAOTTGOAT ATACTACCGA ACAATATCTA ATCTCTTrTT AGGGAAATAA AaTTTGIOCA 6900 
TATATATAAT CCOBAAACAT G 

Seq ID NO: 529 Protein sequence
Protein Accession #: NP_001932.1

21 31 41 51

I I I I 

TLVIFSHDCE ACKKVILNVP SKLEADKIIG RVNLEECPRS 60 

TAKAVALSDK KRSFTIWLSD KBKQTQKEVT VLLEHQKKVS 120 

PCSMQENSLG PFPLFLQQVE SDAAQNYTVF YSISGRGVDK 180 

VDREEYDVFD LIAYASTADQ YSADLPLPLP IRVEDENDNH 240 

TVGWCATDR DBPDTMKTRIi KYSIIiQOTPR SPGUSVHPS 300 

LIMKVQDMDG QFEGLIGTST CIITVTOSND MAPTFRQNAY 360 

KDIiINTAMHR VNFTILKGNB NGHPKISTDK ETNB3VLSW 420 

APFAROIPRV TAUIRAIiVTV HVRDLDEGPE CTPAAQYVRI 480 

RNGNOLKYKK UtDPKGWITI DEISGSIITS KILDREVETP 540

GTLAVHIEDV NDNPPEIIjQE YWICKPKMG YTDIIAVDPD 600 

RLWSI.TKVND TAARLSYQKH AGPQBYTIPI TVKDRAGQAA 660 

SRSTGVILGK WAILAILU5I AliPSVIiLTL VCGVFGATKG 720 

GDORVCSANG FKIQTTNNSS QGFCGTMGSG MKHGGgETIB 780 

LDSaiaaHTB VDHCRYTYSB WHSFTQPRWJ EKLHRCHQME 840 
AGSVGCCSEK QEEOGLDFLN NUEPKFITLA BACTKR 



MAAAGPRRSV RGAVCLHIjLIi 

adlirssdpd frvlndgsvy 
ktrhtretvl rrakrrwapi 
bpuhiFyier dtoilfctrp 
pvfteaiynf evlbssrpgt 
tgvittvshy ldrewdkys 
eafvebhafm veilripieo 

ECPWYEENRQ VMLBIGVmiE 
KENIAVGSKI l^aYKAYDPE^f 
BNELYMITVL AIDKDDRSCT 
EPVHGAPFVF SLPNTSPBIS 
TKLI«VNLCE CTHPTQCRAT 
KRFPEOLAQQ NJjIISNTBAP 
HMKGGNQTI>B SCRGAGHHHT 
DRMPSQDYVIi TYNYEGRQSP 



Seq ID NO: 530 DNA sequence 

Nucleic Acid Accession #: NM_016583.2

Coding sequence: 72..842

1 11 21 31 41 51 

I I I I I I 

GGAGTGGGGG AGAGAGAGGA GACCAGGACA GCTGCTGAGA CCTCTAAGAA GTCX»GATAC 
TAAGAGCAAA GATGTTTCAA ACTQGGGGCC TCATTGTCTT CTACGGGCTG TTAGCCCAGA 
CCATGGCCXav GTTTGGAGGC CTGCCCGTGC CCCTGQACCA GACCCTGCCC TTGAATGTGA 
ATCCAGCCCT GCCCTTGAGT CCCACAGGTC TTGCAGGAAG CTTGACAAAT GCCCTCAGCA 
ATGGCCTQCT GTCIGGGGGC CIQTTGGGCA TTCTGGAAAA CCTTCOGCTC CTGGACATCC 
TGAAGCCTGO AGOAGGTACT TCTGGTG6CC TOCTTGGGGG ACTGCTTOeA AAAOTO ACBT 
CAGTGATTCC TGGCCTGAAC AACATCATTG ACRTAAAGGT CACTGACCCC CAGCTOCTGG 
AACTTGGCCT TGTGCAGAQC CCTOATGGCC ACOQTCTCTA TGTCACCATC CCTCTOGGCA 
TAAAGCTCCA AGTGAATACG CCCCTGGTCG GTGCAAGTCT GTTGAGGCTG GCTGTGAAGC 
TGGACATCAC TGCAGAAATC TTAGCTGTGA GAGATAAGCA GGAGAGQATC CACCTGGTCC 
TTGGTGACTG CACCCATTCC CCTGGAAGCC TGCAAATTTC TCTGCTTGAT GGACTTGGCC 
CCCTCCCCAT TCAAGGTCTT CTGGACAGCC TCACAGGGAT CTTGAATAAA GTCCTGCCTG 
AGrroOTTCA GGGCAACGTO TGCCCTCTGO TCAATGAGGT TCTCAGAGGC TTGGACATCA 
CCCTGGTGCA TGRCATTGTT AACATGCTGA TCCAOGGACT ACRGTTTGTC ATCAAfiGTCT 
AAGCCTTCCA GGAAGGGGCT GGCCTCTGCT GAGCTGCTTC CCAGIGCTCA CAGATGGCTG 
GCCCATGTGC TGOAAGATGA CACASTTGOC CTCTCT C OSA OQAACCT6CC CCCTCTCCTT 
TCCCACCAGG OBTGT G TAAC ATCCCATGTG CX:TCACCTAA TAAAATOGCT CTTCTTCTGC 



1 11 21 31 41 51

I I I I I I 

MFQTGGLIVF YGLLAQIMAQ FGGIiPVPIiDQ TLFLNVNFAIi PIiSPTGIAGS LTNALSNGLL 
SGGLUSIIftl IiPLLOILKPG GGISGQLLGG UiGKVTSVIP GLHKIIDIKV TDPQU>EU}I. 
VQSPDGHRLY VTIPLGIKLQ VNTPLVGASL LRLAVKUIIT ABIIAVRDXQ ERIiOiVLGDC 
THSFQSLQIS LUXJLOPLPI QGIiLDSLTGI IiNKVLPELVQ GNVCPIiVNEV LRGIiDITLVK 
DIVNMUHQL QFVIKV 



Seq ID NO: 532 DNA sequence

Nucleic Acid Accession #: NM_004363.1

Coding sequence: 115..2223

1 11 21 31 41 51

I I I I I I

CTCAGGGCAG AGGGAGGAAG GACAGCAGAC CAGACAGTCA CAQCRGCCTT GACAAAACGT 60 
TCCTGGAACT CAAGCTCTTC TCCACAGAGG AGGACAGAGC A6ACAGCAGA GACCATGGAO 120 
TCTCCCTCGG CCCCTCCCCA CAGATGGTGC ATCCCCTGGC AGAGGCTCCT GCTCACAGCC 180 
TCACTTCCAA CCrrCTGGAA OCCSaOCCACC ACTGCCAAGC TCACTATTGA ATCCACGCCG 240 
TTCAAOXSTCG CAOAGGGBAA GGAGGTGCrT CTACTTGTCC ACAATCTGCC CCAGCATCTT 300 
moaCTACA OCTGQTACAA AGGTOAAAGA QTGGATGGCA ACOGTCAAAT TATAGGATAT 360 
OtAATAGGAA CTCAACAAGC TACCCCAGGG OCOGCATACA GTGGTCGAGA GATAATATAC 420 
CCCAATGCAT CCCTGCTGAT OCAGAACATC ATCCAGAATO ACACAGGATT CTACACCCTA 480 
CACGTCATAA AQTCAGATCT
QAGCreCCCA AGCCCTCCAT 
GTGGCCTTCA CCTGTGAACC 
CACAGCCTCC OSGTCAGTCC 
TTCAATGTCA CAAGAAATGA 
GCCAGGCGCA GTQATTCAGT 
TCCCCTCTAA ACACATCTTA 
TCTAACCCAC CT6CACAGTA 
GAGCTCTTTA TCCCCAACAT 
AACTOMSACA CTGGCCTCAA 
CCCRAACCCT TCATCACCAG 
TTAACCTGTG AACCTGAGAT 
CTCCX»3TCA GTCCCAGGCT 
GTCACAAGGA ATGATGTAGG 
CACAGCGACC CAGTCATCCT 
TCATACACCT ATTACCGTCC 
CCACCTGCAC AOTATTCTTG 
TTTATCTCCA ACATCACTGA 
GCX3WHX3GCC ACAGCAGGAC 
CCCTCCATCT CCAGCAACAA 
TGTGAACCTG AGGCTCAGAA 
GTCAGTOCCA GGCTGCAGCT 
AGAAATQACX3 CAAGAGCCTA 
QACCCAGTCA CCCTGGATGT 
TCGTCTTACC TTTCX3GGAGC 
CCGCAGTATT CTTGGCGTAT 
GCCAAAATCA CGCCAAATAA 
GGCCGCAATA ATTCCATAGT 
CTCTCAGCTG GGGCCACTGT 
TAGCAGCCCT QGTGTAGTTT 
TAAAGCATTT GCAACAGCTA 
AGACTCTGAC CAGAGATCGA 
AAATACAAAA ATGAGCTGGO 
TGAGGCAGGA GAATCGCTTG 
ACTGCACTCC AGTCTGGCAA 
TCTQACCTGT ACTCTTQAAT 
AACTTTAATG AACTAACTGA 
TAATTAATTT CATGGGACTA 
TTCCCAGATT TCAGGAAACT 
AAATATACTT TTGTGAACAA 
AGACTTGGGA AACTATTCAT 
TCAATAAAAA TCTGCTCTTT 



CAGGCTGCAG 
CACAGCAAGC 
CATCCTQAAT 
CAGATCAGGG 
CTCTTGGTTT 
CACTGTGAAT 
TAGGACCACA 
CAACAACTCC 
TCAGAACACA 
GCAGCTOTCC 
ACCCTATGAG 
GAATGTCCTC 
AOqpGTGAAC 
GCTGATTGAT 
GAAGAACAGC 
TACAGTCAAG 
CTCCAAACCC 
CACAACCTAC 
GTCCAATGGC 
TGTATGTGOA 
CCTCTA1GG6 
GAACCTC3U^C 
CAATGGGATA 
TAACGGGACC 
CAAGAGCATC 
CGGCATCATG 
CTTCATTTCA 
CAGTCTAAAA 
GACC3VTCCTA 



6AAGCAACTG 
AACTCCAAAC 
GACX3CAACCT 
CTGTCCAATG 
TACAAATGTG 
GTCCTCTATQ 
GAAAATCTGA 
GTCaU^TGGGA 
AATAGTGGAT 
GTCACGACGA 
AACCCCX3TGG 
ACCTACCTGT 



GCCAGTTCCG 
CCGTGGAGGA 
ACCTGTGGTG 
GCAACAGGAC 



TATGGCCCAG 
CTCAGCCTCT 
GGGAACATCC 
GGACTCTATA 
ACAATCACAG 



CTGTGGTGGG 
AACAGGACCC 
ATCCAGAACT 
CGGGACACGC 
CTCTCCTGCC 



tatgoctgtt 
acagtctctg 
attggagtgc 

GGAAGACTGA 
TGCTTCT 



GCCGGGATGC 
acctctcctg 
CTTTCCAGCA 
CCTATAOQTG 
TCACAGTCTA 
AGGATGAGGA 
GGTGGGTAAA 
GGACCCTCAC 
AGAAOGAATT 
ACGACCCCAC 
CCTGCCATGC 
AGCAACACAC 
CCTGCCAGGC 
TCTCTGCGGA 
AGGATQCTGT 
TAAATGGTCA 
TCACTCTATT 
CAGTGAOTGC 
CCATCATTTC 
RCXCXK5CCTC 
ACACACAAGT 
TTGTCTCTAA 
CATCTGGAAC 



GGTATACCCG 
CAAGGATGCT 
GGTAAACAAT 
CCTCACTCTA 
CCCAGTGAGT 






CCACGCA6CC 
ATCCACCCAA 
CCAAGCCCAT 
TGCAGAGCCA 
TGCTGTAGCC 
TAATCAGAGC 
TCTACTCAGT 
AAGTGTTGAC 
CATTTCCCCC 
AGCCTCTAAC 



GCCAACATC6 1 



CAATAACrCA 
GCTGCCCAAG 
GGCCTTCACC 
GAGCCTCCCA 
CAATGTCACA 
AAACC3GCAGT 
CCCC<XAGAC 
TAACCCATCC 
TCTCTTTATC 
CTTGGCTACT 
TTCTCCTGGT 
TGCTCTGATA 
GCTTCTTCCT 
TTTACAGAAA 
TCTCTACTAA 
CnSGGGAGGC 
A6ATCGCACC 




ACAAcrrrrcT 

CAGCTTCATO 
AATGAACTAA 

AAATT6AGAC 
OAATATTTAT 
GTATAACAOA 



1200 
1260 
1320 
1380 
1440 
1500
1560 
1620 
1680
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520
2580
2640 
2700 



TGAGGATTGC 
TAAGCTATCC 
ATTTACATTT 
ATTGTATGGT 
AAAA 



ACTCTTACAG C 
TCTCCCTATG 1 
AATATAGTTA T 



41 



WCIPWQRLLL 
ERVDGMRQII 
HEEATOQFRV 



TASLLTFWNP 
GWIGTQQAT 
yPELPKFSIS 



QSLPVSPRLQ 
SPSYTYlfRPG 
MSASGHSRTT 
LPWSPRLQLS 
PDSSYLSGAN 



TTVTTITVYA 
I.SNDIilRTLTIi 
VNLSL5CKAA 
VKTITVSAEL 
N(2«lTLTIiFN 
UILSCSSASN 
SITVSASGTS 



AASHFPAQYS 
BPPKPFITSN 
LSVTRNDVGP 
SNFPAQYSWI. 



VTRNDARAYV 
PSPQYSWRIN 
PGIiSAOATVQ 



YECGIQMELS 
lOGKIQQHTQ 
KPVEDKDAVA 



I 

! TPFNVAEGKE 
: lYPNASLLIQ 
DAVAFTCEPE 
VSARRSOSVI 
TQEIiPIPNlT 
VALTCEPEIQ 
VDHSDPVILN 
EliFISNITEK 
FTCEPBAQIIT 
RSDPVnjDVIi 
PIAKITFHNM 
^ LI 



51 
I 

" VLLLVHNLPQ 
NIIQNDTGPY 
TQBATYLWWV 
IjNVIiYGPDAP 
VNNSGSYTCQ 
NTTYLMWVNN 
VLYGPDDPTI 
HSGIiyTCQAK 



TYIiHHVliGQS 540 
YGPOTFIliBP 600 



Seq ID NO: 534 DNA sequence

Nucleic Acid Accession #: NM_006952.1

Coding sequence: 11..793



41 



AATCCCQACA A 



ATCTGACCAA CACRGCCTCT 
ATOQGCATAT 
ATGAAGTCCA 
rCTGAAGTGG 
TTCCTQAAGC 
TGGAAAAACA 
GGCGTAAATG 
GATGCTQACT 



TGTAGGCATC 
AGTATATGCC 
ACCCAACCTC 
TGATGACCAG 
CAATTGCTGT 
TGAGAATAAT 
AGAACCTCTC 



CTGA-r CTCTG 
TAAGAA 



ACAACTCAAC 
GTTGCX3GCAT 
ACCCACTGCT 
TTGTGGGCAT 
GCAGGAAAAT 
CATCTTGTAT 
AGATGCTAGA 
ATGGAGTCa^C 
GTCCATCAGA 
ATCCCTGGCC 
CTTGTAAACT 
GTCCAATGAA 



TGTTCGTTGC 
TGCCCTGACT 
TGAAGCCACC 
ctgcctcttc 
TCTTCTGGCG 
CACAGCAGCA 
GAGGTACCAA 
CAAAACXTTGO 
CTGGCAAAAA 
TOJTCASTQC 
AGGCGTGCCT 
CCGACACGCC 



I I 
TTCCAGGGCC TGCTGATTTT 
GCGQAGTGCA TCTTCTTTGT 
GACAACGATG ACATCTATGG 
TGCCTGTCTG TTCTAGGCAT 
TATTTCATTC TGATGTTTAT 
ACACAACGAG ACTTTTTCAC 
AACAACAGCC CTCCRAACM 



TACRCKTCTG CCTTCCGGAC 540 

TSTGTTATOA ACAATCTTAA 600 

GaTTTTTATC ACAATCAOGG 660 

TGQGGGCrrrG CCTGGTTTGG 720 

ACCATGTTCT ACTGGAGCAQ 780 




MAKDNSTVRC FQGDIjI FGNV IIGCCGIAIiT AECIFFVSDQ RSIiYPIiLEAT IWODIYGAAH 
IGIFVGICLP CLSVU3IVGI MKSSRKIUA YFII«FIVYA FEVASCITAA TQRDFFTPHIi 
FLKQMLERYQ NNSPPNOTDQ WKNNGVTKTW DRLMLQDNCC GVNGFSDHQK YTSAFRTENK 
DADYPWPRQC CVMNNLKEPI. NLEACKLGVP GFYHNQGCYE IiISOPMNHHA WGVAWraFAI 
LCWTFWVLLG TMFYWSRIEY 

Seq ID NO: 536 DNA sequence

Nucleic Acid Accession #: NM_002638.1

Coding sequence: 120..473






I 

CAATACAGCT 
GCTGGACTGC 
TGAGGGCCAG 
AGGCAGCXGT 
TCAATGGACA 
OSCAAGAGCC 



TCAAGAAGTG 
CGGTCCTTGC 
TGCTGCCCTT 
GAGCTGCCTC 



AAG6AATTAT 
ATAAAGATTQ 
CAGCTTCTTQ 
CACGGGAGTT 
AGATCXrCGTT 
AGTCAAAGGT 
CATGTTGAAT 
CTGTGAAGGC 
TGCACCTGTG 
CCCCTTCCCA 
TCTCATCCAC 



CCTGTTAAAG 
AAAGGACAAO 
CCAGTCTCCA 
CCCCCTAACC 
TCTTGCXSGaA 
CCGTCCCCAG 
CACTGTCCAT 
TTTCCAATAA 



TACCACAOAC 
AGCTCTTAGC 
TGTTCCTCAT 



TTTCAGITAA 
CTAAGCCTGG 
GCTGCTTGAA 
TGGCCTGTTT 
AGCTACAGGC 
TCTTCCTCCC 



41 51 
I I 
COSCCCTGGA GCCAGSCCAA 
CAAACACCTT OCTGACACCA 
CGCTGGOACQ CTGQTTCTAG 
T6TCAAAGGC COIGTTCXAT 
AGGICAAGAT AAAGTCAAAG 
CTCCTGCCXX: ATTATCTTQA 
AGATACTQAC TGCCX»GQAA 






Seq ID NO: 538 DNA sequence
Nucleic Acid Accession #: NM_001793.2
. .2560 



I I I I I I
MRASSFLIW VFLIAGTLVL EAAVTQVPVK GQDTVKGHVP FMGQDPVKGQ VSVXGQDKVX 
3 MACFVPQ 



TCACCCCTCT 
TCCAGGTTTG 
CTGAAGTGAC 
TATTCATGGG 
CTGTGCGGAA 
AGATCTTCCC 
TATCTOTCCC 
ATAAAGATAG 
CTGAGGGTGT 
ACCGGGAGGA 
CAGTGGAGGA 
AGTTTACCCA 
TGATGCAGGT 
CTTACTCCAT 




ATCCAAAOGT 
TGAAAATGGC 
AGACACCAAG 
CTTCGCTGTA 
GATTGCCAAG 



ATCTTACX3AA 
AAGGGTCCCT 
ATTTTCTACA 
GAGAAGGAGA 



CTCIGTTTAO 
OAAGGTCACT 
GACACAAGAG 

TCCCXKSiGAG 



GGACACCTTC C 



3 TCTTAGAGGO 



CCATAGCCAA 
CACCATCAGC 
CATCCAGGCC 
GATCCTTGAT 
GCCTGAGAAT 
CMCTCACCA 
TACCATCACC 



GTCATCTCCA 



GCCAATGACA 
GCAGTGGGCC 
GCGTGGCGTO 
ACCCACCCTQ 



ACCCACACX3A 
QTGGCCTGGA 
ATGGGGACGO 
ATGCTCCCAT 
ATGAGGTGCA 



OCCTGTGTGT 
CATCCTGftGA 
TGTGGGCACC 
GGTCTTGGCC 
ACTGATTGAT 
CCAAAGCCCT 



GTCTACACTO 



GGAAGOTGAC 
GCACCTTTCT 
GTGCGACTGC 
CCCTGJGCTG 
GAQAAAGAAQ 
CGTCTTCTAC 
GCTCCaCOGA 
CATCATCCCQ 
TATAATTGAG 
CTTGGTGTTC 
CTCCGCCTCC 
GAAGCTGGCA 
GGGACCAAAC 
GACTTCGGAG 
AajTTAOAGT 
AGCACMGAAA 
TCTTACCTGC 
TACAGTGGAC 



CTCGACCGTG 
ATGGACAATG 
GTCAATGACC 
GTGCGCCAGG 
GCCCAGCTCA 
ACAGTGGTCT 
CTGTCTGACC 
CATGGCCATG 

CGGAAGATCA 
TATGGCGAAG 
GGTCTGGAGG 
ACACCCATGT 
AACCTGAAGG 
GACTATGAGG 
GACCAAGACC 
GACATGTACG 



CCAAAGTOGT 
CAOAAGACCC 
GGTGGCTAQC 
AGGATGAGCA 
GAAGCCCTCC 



AATCCATTGA 
GTTGCTCCAA 
CTCAAGTCTA 
GACAGCCCCC 
GTTGTTGAAT AAGCCACTGG 
TGTGTCAGAG AATGGTGCCT 
CCAGAATGAC CACAAGCCCA 
AGTCCTACCA GGTACTTCTG 
CRCCTACAAT GGGSTGGTTG 
CCTCATOTTC ACCATTCACC 
GTCCCTCRGT 
ACGGCAGTGG 
CAGAAGTACG 
GTCACTQATC 

TATCATGGGC 
GGGCATCXntJ 
OGTTGAAGTG 
AGTGQTCCAC 



ACAACCAGGA 
ACCAAOGAGG 
GTGGAGGATG 



ACACACTGAC 
CACTAGTGGA 
AGGCCCATGT 
TGGACGCCCC 
GGGACCATTT 



TGACAAGGAG 



TGCTGAACAT 
CAGATGACTC 
TGTCCCTGAA 
ATGGCAACAA 
TCGAAACCTQ 
TGGCTCTGCT 
AGGAGCXXXT 
AGGGGGOTGG 
CCAGGCCGGA 
ACCS3TCCTCG 
CGGCTAACAC 



GTTTGTGAGG 
CACCACTGGC 
CCCTGAGCCC 



AGACATCTAC 
GAAGTTCCTG 
AGAGCAGCTO 
CCCTGGACCC 
GTTCCTCCTG 
CXTTACSCCCA 



AATCAAAAGA 
GACAOTGGGC 
AACAACATCT 
ACGGGAACCC 
CGTCAGATCA 
GACCTGTCTC 
TaOACGGCAG 
AAGCAGGATA 
ACGGTGATCA 
TGGAAGGGAG 
CTGGTGCTGC 
GAAGATOACA 



CCCCTTTTGT 
TGAATGAGGC 
CXACTGGGGA 
TCAGCTACCXS 
AGGTCACa^GC 
ATGAA6TCAT 
TTCTGCTAAC 
CCATCTOeSUV 
CCCACACCTC 
AGGTCAACQA 
CATATGACGT 
GGGCCACTGT 
GTTTCATCCT 



TTTGn 



CTTGTCAGGA 
GGTTGCTTCC 
ACCTCTCCAC 
CXSTAAAATGC 
TTTCTCTCTG 



AAGATTAC6A 
GTGGOGGGGA 
AGAGCATCTC 
AGTGGCCOTA 
TTAGCCTTTC 
CTGGGCCAfiO 
TCAAOCCTGT 
6AATGGAA0C 



GCCAGOCAAC CCIU3AT6AAA TCGGCAACTT 
AGACCCCACA GCCXXX3CCCT ACQACACCCT 
CXSVOGCOGOG XCCCTGAGCT CCCTCACCTC 
TTATCTGAAC OAOrGGGGCA GCCGCTTCAA 
GGACGACIAG GCGGCCTGCC TGCAGGGCTG 
CAAGGGBTCT CASTTCCCCC TTCAGCTGAG 



ACTTAATTTT



1020 
1080 
1140 

1260
1320 
1380 



1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 

2460 
2520 
2580 
2640 
2700 
2760 
2820
2880 
2940 




TTTTTTTAAT QCTATCTTCA AAACGTTAQA OAAAQTCCTT CMAAGTGCA GCCCAGAGCT 3000 

QCTGGGCXX» CTGGCCGTCC TGCATTTCK5 GTTTCCftGAC CCCAATGCCT CCCATTOGGA 3060 

TOGATCTCia CGTTTTTATA CTGAGTGTGC CTAGGTTGCC CCTTATTTTT T ATTTTCCC T 3120 

eTTGCGTTGC TATAGATGAA GGGTGAGGAC AATaJTQTAT ATGTACTAGA ACTTTTTTRT 3180 
TAAAGAAACT TTTCCCAGAA AAAAA 



1 11 21 31 41 51

I I I I I I

MGLPRGPIiAS LLLliOVCWLQ CAASBPCRAV FREAEVTLEA GGAEQEPGQA LGKVFMGCPG 
QBPAI.FSTDN DDFTVRNGET VQERRSLKER NPUCIFPSKR ILRRHKRDWV VAPISVPEWG 
KOPFPQRUJQ LKSMKDRDTK IPYSITGPGA DSPPEGVFAV EKETQWLLLN KPLDHEEIAK 
YELPGHAVSE MGASVEDPMN ISIIVTDQND HKPKFTQDTP RGSVLEOTIiP OTSVMQVTAT 
DEDDAIYTYN GWAYSIHSQ EPKDPHDIJ4P TIHRSTGTIS VISSGLDREK VPBYTI.TIQA 
TDMDGDGSTT TAVAWEILD ANDMAPMFDP QKYBAHVPEN AVGHEVQRLT VTDLDAPMSP 
AWRATYLIMQ GDDGDHFTIT THPESNQGIb TTRKGLDFEA KNQHTLYVEV TKEAPFVIjKI. 
PTSTATIWH VEDVHEAFVF VPPSKWEVQ EGIPTGEPVC VYTAEDPDKB NQKISYRIIiR 
DPAiGWIin»a>P DSGQWTAVOT U3REDEQFVR NNIYEVMVLA MDNGSPPTTG TGTLLLTLID 
VNDHGPVPBP KQITIC3IQSP VRQVUJITDK DLSPHTSPFO AQLTDDSDIY WTAEVNEEGD 
TWLSLKKPtj KQDTYDVULS LSDHGNKEQL TVIRATVC33C HGHVETCPGP WKGGPILPVIi 
GAVIALIiFIiIi LVIiLLLVRKK RKIKBPLIiLP EDDTRDNVFY YGEEGGGBED QDVDITQLHR 
GDEARPBWL RNDVAPTIIP TPMYRPRPAN PDEIfflWIlB NUCAANTDPT APPYDTU.VF 
OYEGSGSDAA SLSSLTSSAS DQDQDYDYLN EWOSRFKXIA DMYGGQEDD 

Seq ID NO: 540 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..672



I I I I I I
ATGAGGCTCC AAAQACCCCXJ ACAGOCCOCG GCGGOTOGGA GGCGOGOGCC CCGGGaOGGG 
CGGGGCTCCC CCTACCGGCC AGACCCX3GGG AGAGGOGCGC GGAGGCTGCG AAGGTTCCAG 
AAGGGCGGGG AGOaaGOGCC GOGCQCTGAC CCTCOCTGQG CACCiGCIIGGO GACGATGGCO 
CTGCTCGCCT TGCTaCTGGT OQTGGCCCTA CCQCQG6TGT GGACAGACGC CAACCTGACT 
GCGAGACAAC GAGATCCAGA GGACTCCCAG CGAACGGACG AGGGTGACAA TAGAGTGTGG 
TGTCAT6TTT GTGAGAGAGA AAACACTTTC GAGTGCCAGA ACCCAAGGAG GTGCAAATGG 
ACAGAGCCAT ACTGCGTTAT AGCGGCCOTG AAAATATTTC CACGTTTTTT CATGGTTGCG 
AAGCAGTGCr CCGCTGGTTG TGCAGCGATG GAGAGACCCA AGCCAGAGGA GAAGCGGTTT 
CTCCTGGAAG AGCXX»TGCC CTTCTTTTAC CTCAAGTGTr GTAAAATTCG CTACTGCAAT 
TTAGAGGGGC CACCTATCAA CTCATCAGTG TTCAAAGAAT ATGCTQGGAG CATGGGTGAG 
AGCTGTGGTG GQCTGTGGCT GQCCATCCIC CTGCTGCTCQ CCTCCATTGC AGCCX3GCCTC 
AGCCTGTCTT OA 



1 11 21 31 41 51

ilRLQRPHQAP icGRRAPRGG RGSPYRPDPG SGARRLRRFQ KGOBGAPRAD PPWAPLGTMA 60 

LLALLLWAL PRVHTDANLT ARQRDPEDSQ RTDEOXniVW CHVCEHBSTP BCQMPHRCKII 120 

TEPYCVIAAV XIFPRFFMVA KQCSAGCAAH ERPKPEBKRF LLBEPMPFFY I^CCKIRYCN 180 
LBQFPnrSSV FKBYAGSKOB SOGGLHLAII. LLLASIAAGL SLS 

Seq ID NO: 542 DNA sequence

Nucleic Acid Accession #: XM_03Sa92.2

Coding sequence: 53..1576

1 11 21 31 41 51 

aCTOSCrGGG CCXSCGOCTCC CGGGTGTCCK AGGCCCGGCC GOTGCGCAGA GCATGGCGGG 60

TGCGGGCCCQ AAGOGQCGCG CGCTAGCGGC GCtXMCGGCC GAGGAGAAGG AAOAGGCGCG 120 

GGAGAAGATQ CTGGCCGCCA AGAGCX3CGGA CGGCTCGGOG CCOaCAGaOO AGOGCGAGGG 180 

CX3TGACCCTG CAGCGGAACA TCACGCTGCT CAAOGGCXJIG OCOVTCATCXS TGGGGACCAT 240 

TATCGGCTCG GGCATCTTCG TGACGCCCAC GGGCQTGCTC AAGGAGOCAG GCTOGCXXSGG 300 

GCTGGCGCTG GTGGTGTGGG CCGCGTGOGG CJOTCTTCTOC ATCOIGGGCG OGCTCTGCTA 360 

CGCGGAGCTC QGCACCACCA TCTCCAAATC GGGCGGCXJAC TAOGCCTACA TGCTOGAGGT 420 

CTAOGGCTCG CIGCCOGCCT TCXTCAAQCT CTGGATCGAO CTGCTCATCA TCOGGCCTTC 480 

ATCGCAGTAC ATOGTGGCCC TGGTCTTOGC CACCTACCTG CTCAAGCCGC TCTTCCCCaC 540 

CTGCCCGGTG CCCGAGGAG6 CAGCCAAGCT CGTGGCCTGC CTCTGOGTGC TGCTGCTCAC 600 

GQCCGTGAAC TQCIACAGCG TQAAGGCCGC CACCCGGGTC CAGOATGCCT TTGCCGCCGC 660 

CAAGCTCCIG GCCCTGGCCC TGATCATCCT GCTGGGCTTC GTCCAGATCO GAAAGGGTGA 720 

TOTOTCCAAT CTAQATCCCA ACTTCTCATT TCAAGGCACC AAACTGGATG TGGGGAACAT 780 

TGTGCTGGCA TTATACAGCXJ GCCTCTTTGC CTATGGAGGA TGQAATTACT TQAATTTOir 840 

CACAGAGGAA ATGATCAACC CCTACAGAAA CXTTGCCCCTQ GCCATCATCA TCTOCCTGCC 900 

C»TCGTGACa CTGGTGTACG TGCTGACCAA OCTGGCCTAC TTCACCACCC TGTCCACCGA 960 

GCAGATGCTG TCGTCCOAGG COSTGGCOGT GGACTTOGGG AACTATCACC TGGGCGTCAT 1020 

GTCCTG6ATC ATCCCCX3TCT TCGTGGGCCT GTCCTGCTTC GGCTCCGTCA ATGGGTCCCT 1080 

GTTCACATCC TCCAGGCTCT TCTTCX3TGaa OTCCOGGGAA GGCCACCTGC CXTTCCATCCT 1140 

CrCCATGATC CACCCACAGC TCXnX»CCCC OGTGCC G TCC CTCGTGTTCA CGTGTGTGAT 1200 

GACGCIGCTC TACGCCTTCT CCAAGGACAT CTTCTCCSSTC ATCAACTTCT TCAGCTTCTT 1260 

CAACTGGCTC TGCX3TGGCCC TGGCX3VTCAT CGGCATOATC TGGCTGOGCC ACAGAAAGCC 1320 

TGAGCTTGAG CGGCCCATCA AGGIGAACCT GGCCCTGCCT GTGTTCTrCA TCXTTGaCCTQ 1380 

CCTCTTCCTG ATCGCOGTCT CCTTCrGGAA GACACC00I6 GA6TOIG6CA TOGGCTTCAC 1440 

CATCATCCTC AGCGGGCTQC CC3GTCTACTT CTTCOGGGTC TQGIGQAAAA ACAAOCCCAA 1500 
GTGGCTCCTC CAGOGCATCT TCTCCACGAC OBTCCTQTGT CASAAOCTCA TGC" 



1560 




CCCX;CAGGAa ACATAGCCAG GAGGCCGA8T GGCTGC»GGA GXAGCATGC 



1 11 21 31 41 51 

I I I I I I 

MAQAQPKRIiA IiAAPAAEGKB EAREKMLAAK SADGSAFAGE GEGVTIjQRNI TLLNGVAIIV 
GTIIGSGIFV TPTGVIiKEAG SPGLALWWA ACGVFSIVQA LCYAEI/STTI SKSGGDYAYM 
LEVYGSLPAP LKIiWIBLLII RPSSQYIVAL VFATYLUCPli PPTCPVPEEA AKLVACIiCVL 
IiIiTAVMCYSV KAATKVQDAF AAAKLLALAL IILI/3FVQIG KGDVSNLDPN FSPEGTKIjDV 
GNIVLALYSG LFAYGGWMYI. NFVTEEMINP YSNLPLAIII SLPIVTLVYV LTNLAYFTTL 
STEQMLSSEA VAVDFOIYHIi GVMSWUPVF VQI.SCFGSVN GSLFTSSRLF FVGSREGHLP 
SILSMIHFQIi iTPVPSLVFT CVMTLI.YAFS KDIFSVINFF SFFNWLCVAL AIIGMIWIjRM 
RKPEIiERPIK VNLAIiPVFFI LACIiFLIAVS PWKTPVBCGI GFTIIIiSGLP VYFFGVWWKN 
KPKHLLQGIP STTVLCQKLM QWPQBT 

Seq ID NO: 544 DNA sequence

Nucleic Acid Accession #: NM_005268.1

Coding sequence: 168..969



1 11 21 31 41 51

I I I I I I

TAAAAAGCAA AAGAATTCGC GGCCGCGTCG ACAOQGGCTT CCCCOAAAAC CTTCX:CCGCT 60

TCTGGATATO AAATTCAAGC TGCTTGCTOA QTCCTATTGC CG6CTGCTGG GAGCCAGGAG 120

AGCCCTGAGG AGTAGTCACT CAGTAGCAGC TeAOGCGTGG GTCCACCATG AACTGGAGTA 180 

TCTTTGAGGQ ACTCCTGAGT GGGGTCAACA AGTACTCXac AGCCTTTCGG CGCATCTGGC 240 

TGTCTCTGGT CTTCATCTTC CGCGTGCTGQ TGTACCTGGT GACJQGCCOAG CGTGTGTGGA 300 

GTGATGACCA CAAGGACTTC GACTGCAATA CTCGCXAGCC CGGCTGCTCC AAOSTCTGCT 360 

TTGATOACrrT CTTCCCTGTG TCCCATGTGC GCCTCTGGGC CCTGCAGCTT ATCXmSGTGA 420

CATGCCCCTC ACTGCTCX5TG GTCATGCACG TGGCCTACCG GGAGGTTCRG GAGAAQAGGC 480 

ACCXSAGAAGC CXaTGGQOAQ AACflGTGGGC GCCTCTACCT QAACCCCX3GC AAOAAGCGGG 540

OTGGGCTCra GTGGACATAT GTCTGCAGCC TAOTOTTCAA G60QA60GT0 GACATCGCCT 600 

TTCTCTATGT GTTCCACTCA TTCTACCCCA AATAXATCCT CCCTCCTGTG 6ICAAGTGCC 660 

ACX3CAGATCC ATGTCCCAAT ATAGTGGACT GCTTCATCTC CAAGCCCTCA GAOAAGAACA 720

TTTTCACXXT CTTCATGOTG GCCACAGCTO CCATCTGCAT CCTGCTCAAC CTCGTGGAGC 780

TCATCTACCT GGTGAGCAAG AGATGCCACG AGTGCCTGGC AGCAAGGAAA GCTCAAGCCA 840 

TOTGCACAGa TCATCACCCC CAOSGTACCA CCTCTTCCTG CAAACAAGAC GACCTCCTTT 900 

CXMGTGACCT CATCTTTCTG GGCTCAGACA GTCATCCTCC TCTCTTACCA GACCGCCCCX: 960 

GAGACCATGT GAAGAAAACC ATCTTGTGAG GGGCTGCCTG GACTGGTCTG GCAGGTTGGG 1020

CCTGGATGGG GAGGCTCTAG CATCTCTCAT AGGTGCAACC TGAGAGTGGG GGAGCTAAGC 1080

CATGAGGTAG GGGCAGGCAA GAGAGAGOAT TCAOACGCTC TGGGAGCCA6 TTCCTAGTCC 1140 

TCAACTOCAa CCACCTGCCC CAGCTOSACG GCACTGGGCC ACTTCCCCXTT CTGCTCTGCA 1200 
GCTCGGTTTC CTTTTCTAOA ATGGAAATAG T 




1 11 21 31 41 51

I I I I I I

MNHSIFEGLL SGVHKYSTAF GRIWLSLVFI FRVr.VYI.VTA ERVWSDDHKD FDCNTRQPGC 
SKVCFDEFFP VSHVRUIAUl LIWCCPSU. WHHVAYBEV QEKRHREAHG ENSGRLYLNP 
GKKROGLMHT YVCSLVPKAS VDIAFLYVPH SFYPKYILPP WKCHADPCP HIVDCFISKP 
SEKHIFTLPM VATAAICILL NlaVEtilYIiVS KRCBBCLAAR KAQAMCTGKK PKGSTSSCKQ 
DDLLSGSIilP LGSDSRPPLL PDRPROKVKK TIL



Seq ID NO: 546 DNA sequence

Nucleic Acid Accession #: NM_002391.1

Coding sequence: 26..457



1 11 21 31 41 51

I I I I I I

CGGGCGAAGC AGCGCOGGCA GCGAGATGCA GCACCGAGGC TTCCTCCTCC TCACCCTCCT 
CGCCCTGCTG GCGCTCACCT COGCGGTCGC CAAAAAGAAA GATAAGOTOA AGAAGGQCGG 
CCOGGGGAGC GAGTGOBCTQ AGTGGGCCTG GGGGCCCTGC ACCCCCAGCA GCAAGGATTG
CGGCXrrGGOT TTCOQOQAflG aCACCTQCXXJ GQCCCAGACC CAG03CATCC GGTGCAGGGT 
GCCCTGCAAC TGGAAGAAGG AOTTTGGAGC CGACTGCAAG TACAAGTTTG AGAACTOGGQ 
TGCGTGTGAT GGCGGCACAG GCACCAAAGT CCGCCAAGGC ACCCTGAAGA AGGCGCGCTA 
CAATGCTCAQ TCCCAGQAGA CCATCCGCGT CACCAAGCOC TGCACCCCCA AGACCAAAGC 
AAAGGCCAAA GOCAAGAAAG GOAAGGGAAA GGACTA6ACG CCAAGCCTGG ATGCCAAGGA
GCOCCIGGTa TCACATGGGG CCTGGCCAOG CCCTCCCTCT CCCAGGCCC6 AGATGTGACC 

CAccngraoc ttctotctgc tcgttagctt taatcaatca tgccctgcct tqtccctctc 
Aciccccrxrc cccaoxcta agtgoccaaa gtgqggaogs acaagggatt ctgggaagct 

TQAGCCTCCC CCAAAGCAAT GTGAGTCCCA GAGCCCGCTT TTGTTCTTCC CCACAATTCC
AtTACTAAGA AAChCATCAA ATAAACTGAC TTTTTCCCCC CAATAAAAGC TCTTCTTTTT
TAATAT 

Seq ID NO: 547 Protein sequence
Protein Accession #: NP_002382.1

1 11 21 31 41 51

I I I I I I

MQHROFIJjIiT LIiAIiIiALTSA VAKXKDKVKK GGPGSECAEN AHGPCTPSSK OCGVGFRE6T 
CQAOTQRIRC RVPCKWKKEF GADCKYKFEN W6ACDGGTGT KVRQGTLKKA RYNAQCQETI 
RVTKPCTPKT KAKAKAKKQK GKD 

Seq ID NO: 548 DNA sequence
Nucleic Acid Accession #: NM_006783.1
Coding sequence: 1..786

1 11 21 31 41 51

I I I I I I

ATGaATTGGG GOACQCTGCA CACTTTCATC GGGGGTGTCA ACAAACACTC CACCAGCATC 
GGGAAGGTGT GGATCACAGT CATCTTTATT TTCCGAGTCS^ TGATCCTAGT GGTGGCTQCC 
CAGGAAGTGT GGGGTGACGA GCAAGAGGAC TTCGTCTGCA ACACACTGCA ACCGGGATGC 
AAAAATGTGT GCTATGACCA CTTTTTCCCG GTGTCCCACA TCXGGCTGTG GGCCCTCCAG 
CTGATCTTCG TCTCCACCCC AGOGCTGCTG GTGGCCATGC ATGTGGCCTA CTACAGGCAC 
GAAACCACTC GCAAOTTCAG GCGAGGAQAa AAGAGGAAIG ATTTCAAAGA CATAGAGGAC 
ATTAAAAAGC ACMGGTTCa aATAGAGOQa TCQCTGTGGT GGACGTACAC CAGCAGCATC 
TTTTTCCXSAA TCATCTTTGA AGC3«3CCTTT ATGTATGTOT TTTACTTCCT TTACAATGGG 
TACCaCCTGC CCTGGGTGTT GAAATGTGGG ATTGACCCCT GCCCCAACCT TGTTGACTGC 
TTTATTTCTA GGCCAACAGA GAAGACCGTG TTTACCATTT TTATGATTTC TQCGTCTGTG 
ATTTGCATGC TGCTTAACGT GGCAGAGTTG TGCTACCTGC TGCTGAAAGT GTGTTTTAQG 
AGATCAAAGA GAGCACAGAC GCAAAAAAAT CACCCCAATC ATGCCCTAAA GGAGAGTAAG 
CAGAATGAAA TGAATGAGCT GATTTCAGAT AGTGGTCAAA ATGCAATCaC AGGTTTCCCa 
AOCTAA



1 11 21 31 41 51

I I I I I I 

MDWGTLHTFI GGVNKHSTSI GKVWITVIFI FBVMILWAA QEVWGDEQED FVCMTIflPGC 
KNVCVDHFFP VSHIHLWALQ. LIFVSTPALL VAMHVAYYRH ETTRKFRRGE KHMDFKDIED 
IKXHKVRIEG SLHWTYTSSI FFRIIFEAAF MXVFVPLYNG YHLPWVLKCG IDPCPNLVDC 
FISRPTEKTV PTIPMISASV ICMLUIVAEL CYliIjIiKVCFR RSKRAQTQKH HFHHALKBSK 



Seq ID NO: 550 DNA sequence

Nucleic Acid Accession #: NM_002571.1

Coding sequence: 99..587

1 11 21 31 41 51 

I I I I I I 

CATCCCTCTG GCTCCAGAGC TCAGAGCCAC CCACAOCCGC AGCCATGCTG TGCCTCCTGC 
TCACCCTGGG CGTGGCCCTG GTCTGTGGTG TCCCGGCCAT GGACATCCCC CAGACCAAGC 
AGGACCTGGA GCTCCCAAAG TTGGCAGGGA CCTGGCACTC CATGGCCATG GCGACCAACA 
ACATCTCCCT CATGGCGACA CTGAAGGCCC CTCTGAGGGT CCACATCACC TCACTGTtGC 
CCACCCCCGA GGACAACCTG GAOATCOTTC TGCACAGATQ OOASAACAAC AGCTGTGTTG 
AQAAOAAGGT CCTTGBAGAO AAQACTGGQA ATCCAAA6AA GTTCA AGATC AACTATAOGG 
TGCOQAACOA GGCCACQCTG CTCQATACTG ACTACGACAA TTTCCTGTTT CTCTGCCTAC 
AGGACACCAC CAOCCCCATC CAGAGCATGA TGTGCCAGTA CCTGGCCAOA GTCCTGGTGG 
AGGACGATGA GATCATGCAG GGATTCATCA GGGCTTTCAG GCCCCTGCCC AGGCACCTAT 
GGTACTTGCT GGACTTGAAA CAGATGGAAG AGCCGTGCCG TTTCTAGCTC ACCTCCGCCT 
CCAGGAAGAC CAGACTCCCA CCCTTCCACA GCTCCAGAGC AGTGGGACTT CCTCCTGCCC 
TTTCAAAGAA TAACCACAGC TCAGAAGACG ATGACGTGGT CATCTGTGTC GCCATCCCCT 
TCCTGCTGCA CACCTGCACC ATTGCCATOQ GGAGGCTaCT CCCTGGGGGC AGAGTCTCTG 
GCAGAGGTTA TTAATAAACC CTTGGAGCAT G 



I I I I I I

MDrPQTKQDL ELPKLAGTWH SMAMATNNIS LMATLKAPLR VHITSLLPTP EDNLEIVLHR 

WENNSCVEKK VW3EKTGMPK KFKINYTVAN EATLIiDTDYD KFIiFLCIiQDT TTPIQSMMCQ 
YIARVLVEDD BIKKIGFIBAF RPLPRHLWYL LDIiHQMEEPC RF 



Seq ID NO: 552 DNA sequence
Nucleic Acid Accession #: _
Coding sequence: 27..1967

1 11 21 31 41 51 

I I I I I I

ACTTGCGTCT CGCCCTCCOG CCAAGCATGG GGCTTCCCAG GCTGGTCTGC GCCTTCTTGC 
TOSOCSCCTO CiaCTGCrST CCTOSCGTOG OGGGTGTGCC CGGAGAGGCT GAGCAQCCTG 
CGCCTOAGCT GGTGGAGQTG GAAGTGGGCA GCACAGCCCT TCTGAAGTGC GGCCTCTCCC 
AGTCCCAAGG CAACCTCAGC CATGTCQACT GGTTTTCTGT CCACAAGGAG AAGCG6ACGC 
TCATCTTCCG TOXOOGCCAO GOCCAGGGCC AGAGCGAACC TGGGGAGTAC GAGCAGCGGC 
TCAGCCTCCA GGACAGAGGQ GCTACTCTGG CCCTGACTCA AGTCACCCCC CAAGACGAGC 
GCATCTTCTT GTQCCAGGGC AAGCGCCCTC GGTCCCAGGA GTACCGCATC CAGCTOCOOB 
TCTACAAAGC TCOGGAGGAG CCAAACATCC AGGTCAACCC CCTGGGCATC CCTGTGAACA 
GTAAGGAGCC TGAGGAGGTC GCTACCTGTG TAGGGAGGAA CGGGTACCCC ATTCCTCAAG 
TCATCTGGTA CAAQAATGGC CGGCCTCTGA AGGAGGAOAA GAACCGGGTC CACATTCAGT 
OBTCOCAaAC TGTGGAGTOG AGTGGTTTGT ACACCTTGCA GAGTATTCTG AAGGCACASC 
TGOTTAAAGA AGACAAAGAT GCCCAGTrTT ACTGTGAGCT CAACTACCGG CTGCCCAGTG 
GOAACCACAT QAAOGAQTCC AOGQAAaTCA CCGTCCCTGT TTTCTACCCG ACAGAAAAAG 
TGTGGCTGGA AGTGGAGCCC GTGGGAATGC TGAAGGAAGG GGACCGCGTG GAAATCAGGT 
GTTTGGCTOA TGGCAACCCT CCACCACACT TCAGCATCAG CAAGCAGAAC CCCAGCACCA 
GGGAGGCAGA GGAAGAGACA ACCAAOQACA ACGGGGTCCT GGTGCTGGAG CCTGCCCGGA 
AGQAACACAG TGGGCGCTAT GAATGTCAGG CCTGGAACTT CGACACCATO ATATOGCTQC 
TGAGTGAACC ACAGGAACTA CTGGTGAACT ATGTGTCTGA CGTCCGAGTO AGTCCCOCAG 
CCCCTGAOAQ ACAGGAAGGC AGCAGCCTCA CCCTGAOCTO TGAGGCAQAG AQTAGCCAGQ
ACCTCQAGTT CCRGTG6CTG AGAGAA6AGA CAGACCACGT GCTGOAAACX: GOGCCTGTGC 
TTCAGTTGCA TC3ACCTGAAA CX3GGAGGCAG GAGt3CGGCTA TCGCTGOGTG GCXTTCTGTGC 1260

CCAGCATACC CSGGCCTGAAC CGCACACAGC TGGTCAAGCT GGCCATTTTT GGCCCCCCTT 1320 

GGATGGCATT CAAGGAGAGG AAGGTGTGGG TGAAAGAGAA TATCGTGTTG AATCTGTCTT 1380 

GTGAAGCGTC AGGGCACCXC CGGCCCACCSV TCTCCTGGAA CGTCAACGGC ACGGCAAGTG 1440 

AACAAGACGA AGATCCACAG CGAGTCCTGA GCACCCTGAA TGTCCTCGTG ACCCCGGAGC 1500

TGTTGGAGAC AGGTaTTGAA TGC3«3aGCCT CCRACQACCT GG6CAAAAAC ACX3M5CATCC 1560 

TCTTCCTGOA GCTOGTaUVT TTAACOVCCC TCAC3UXAGA CTCC3VACACA ACCACTGGCC 1620 

TCAGCACTTC C3VCTGCCRGI CCTCATACCR GAGCCAACAG CACCTCCACA GAGAGAAAGC 1680 

TGCCGGAGCC GGAGAGCCX5G GGCGTGGTCR TCGTGOCTGT QATTGTGTGC ATCXTTGOTCC 1740 

TGGCGGTGCr GGGCGCTGTC CTCTATTTCC TCTATAAGRA GGGCARGCTG CCGTGCAGGC 1800 

GCTCAGGGAA GCAGGAGATC ACGCTGCCXC CX3TCTCX3TAA GACCX»ACTT GTAGTTGAAG 1860

TTAAGTCAGA TAAGCTCCCA GAAGAGATGG GCCTCXTTGCA GGGCAGCAGC GGTGACAAGR 1920 

GGGCrCCGGG AGACCAOGGA GAGAAATACA TCGATCTGAG GCATTAGCCC CGAATCACTT 1980 

CRGCTCXOTT CCCTQCCTGO ACC3VTTCCCA GCTCCCTGCT CACTCTTCTC TCAGCCAAAG 2040 

CCTCCAAAGQ QACTAQAGAG AAGCCTCCTG CTCCCCTCAC CIGCAC AOCC CCTTTCAGAG 2100 

GGCCRCTGOa TTAGGACCTO AGGACCTCAC TTGGOCCTGC AAGCCGCm TC3W3GGRCCA 2160 

GTCCRCCACC ATCTCCTCCA CGTTaAGTOA AGCTCATCCC AAGCAAGOAG CXTOCnGTCTC 2220 

CCGAGCGGGT AGGAGAGTTT CTT6CAGAAC GTGTTTTTTC TTTACACACA TTATGGCTGT 2280 

AAATACCTGG CTCCTGCX»G CAGCTOAGCT GGGTAGCCTC TCTGAGCTGG TTTCCTGCCC 2340 

CAAAGGCTGG CTTCCACXa^T CCAGGTGCAC CACTGAAGTG AGGACACACC GGAGCCAGGC 2400 

GCCTGCTCAT GTTGAAGTGC GCTGTTCACA CCCGCTCCGG AGAGCACCCC AGCGGCATCC 2460 

AGAAGCAGCT GCAGTQTTGC TGCCACCACC CTCCTGCTCG CCTCTTCAAA GTCTCX:TGTG 2520

ACATTTTTTC TTTGGTCAGA AGCCAGGAAC TGGTGTCATT CCTTAAAAGA TACGTGCCGG 2580

GGCCAGGT6T GGTGGCTCAC GCCTGTAATC CCAGCACTTT GGGAGGCCX3A GGCGGGCGGA 2640 

TCACAAAGTC AKJGACGnSAC CATCCIGGCr AACAOSeiGA AACCCTGTCT CTACTAAAAA 2700 

TACAAAAAAA AATTAGCTA6 GCGTAGTGGT TGGCACCTAT AGTCCCAGCT ACTCGGAAGG 2760 

CTOAAOCAOa AOAATGGTAT GAATCCAGGA GGTGGAGCTT OCRGTGAGCC GAGACCGTGC 2820 

CACTGCACTC caGCXrrGGGC AACACAGCGA GSCTCCGTCT OGAGGAAAAA AAAAGAAAAG 2880

ACGCGTACCT GCGGTGAGGA AGCTGGGCGC TGTTTTCGAG TtCRGGTGAA TTAGCCTCAA 2940 

TCCCCXrrGTT CACTTGCTCC CATAGCCCTC TTGATGGATC ACGTAAAACT GAAAGGCAGC 3000 

GGGGAGCAGA CAAAGATGAG GTCTACACTG TCCTTCATGG GGATTAAAGC TATGGTTATA 3060 

TTAGCACCAA ACTTCTACSW ACXWAGCTCA GGGCCCCAAC CCTAGAAGGG CCCAAATGAG 3120 

AQAATGGTAC TTAfiGGATGG AAAAOGGGGC CTGGCTAGAG CTTCGGGTGT GTGTGTCTGT 3180 

CieiGTGTAT 6CATACATA1 GT G TGTATAT ATGOTTTTiST CAGGTGTGTA AATTTGCAAA 3240 

TTOTTTCCTT TATATATGTA TQTATATATA TATATOAAAA TATATATATA TATGAAAAAT 3300 

AAAGCTTAAT TOTCCCAOAA AATCATACAT TGCTTTTTTA TTCTACATGG GTACCACAGG 3360 

AACCTGOGGQ CCTGTGAAAC TACAACXAAA AGGCACACAA AACOGTTTCC AGTTGGCAGC 3420 

AOAGATCAGG GGTTACCTCT GCTTCTGAGC AAATGGCTCa AGCTCTACCA GAGCAOACAG 3480 

CTACCCTACr TTTCAGCAGC AAAACGTCCC GTATGACGCA GCACGAftOGG CCTGGCAGGC 3540 
TGTTAGCAGG AGCTATGTCC CTTCCTATCG TTTCCX3TCCA CTT 



1 11 21 31 41 51 

I I I I I I 

GI.PRLVCAFL LAACCCCPRV AGVPGEAEQP APELVEVEVG STALLKCGM QSQGHMHVD 
WFSVHKEKKT l.IFa\fRQGQG QSBPGEYEQR LSUJDRGATL ALTdVTFQDB KIFLCQQKRP 
RSQEYRIQI.R VYKAPBEPNI QVNPLGIPVH 8KEPEEVATC VGRMOYPIPQ VIWYKNGRPIj 
KEEKNRVHIQ SSQTVESSGI. YTLQSILKAQ LVKEDKDAQF YCELNYRLPS GNHMKESRBV 
TVPVFYPTEK VWLEVBPVGM UCEGDRVEIR CLADGNPPPH FSISKQNPST RBAEEETTND 
NGVLVLBPAR KEHSGRYECQ AWNLDTMISL LSBPQELLVN YVSDVRVSPA APERQEGSSI. 
TLTCEAESSQ DLEFQWLHEE TDQV1.ERGPV LQUDJIiKREA GGGYRCVASV PSIPGLNRTQ 
liVKLAIFGPP WMAPKERKVW VKENMV1*IIjS CEASGHPRPT ISMNVNGTAS EQDQDPQRVL 
STLNVLVTPE LI.BTGVECTA SNDLGKNTSI LFI.BIjVNLTT MPDSNTTTG LSTSTASPHT 
RANSTSTBRK LPEPESRGW IVAVIVCIIiV LAVLGAVtiYF LYKKGKLPOl RSGKQBITLP 
PSEUCTEIiWB VKSDKliPEEM GIitiQGSSQDK RAFGDQGEKY IDLRH 

Seq ID NO: 554 DNA sequence

Nucleic Acid Accession #: NM_003183.3

Coding sequence: 165..2639

1 11 21 31 41 51 

TaORGOCTGG CX3GTAGAATC TTCCCAGTAG GCGGOGOGGO AGGGAAAAGA GGATTGAGGG 
GCTAGSCCGG aOGGATCCCQ TCCTCCCOCQ ATQTQAGCAG TTTTCCQAAA CCCC6TC3U3Q 
CXSAAQOCTCC CCMSAGAOGT GGBOTCGaXA GOaaaaCOQG GAACATOAGG CAGTCTCTCC 
TATTCCTGAC CAGOGTGGTT CCTTTOQTGC TGGCX3CXX3CG ACCTCOGGAT GACCCXSGGCT 
TCGGCCCCCA CCAGAGACTC GAGAAGCTTG ATTCTTTGCT CTCAGACTAC GATATTCTCT 
CTTTATCTAA TATCCAGCAG CATTCGGTAA GAAAAAGAGA TCTACAGACT TCAACACATG 
TAORAACACT ACTAACTTTT TCAGCTTTGA AAAGGCATTT TAAATTATAC CTGACaVTCaUV 
GTACTGAACG TTTTTCACAA AATTTCAAGG TCGTGGTGGT GGATGGTAAA AACGAAAGCG
AGTACACTGC AAAATGGCAG GACTTCTTCA CTGGACAOGT GGTTGGTGAG CCTGACTCTA 
GGQTTCTAOC CCAC^TAAOA GAXQATCATG TTATAATC3U3 AATCAACACA GATGGGGCCX3 
AATATAACAT AGMSCCACTT TGGAOATTTO TTAATGATAC CAAAGACAAA AGAATGTTAG 
TTTATAAATC TGAAGATATC AAGAATGTTT CACBrTTOCA GTCTCCAAAA GTGTGTGGTT 
ATTTAAAWST GGATAATGAA GAOTTGCTCC CAAAAOaOTT AOTAGACAGA GAACCACXTTG 
AAGAGCTTGT TCATCGAGTG AAAAGAAGAG CTGACCCAGA TCCCATGAAQ AACACQTQTA 
AATTATTGGT GGTAGCAGAT CATCGCTTCT ACAGATACAT GOGCAGAGGG GAAGAGAGTA 
CAACTACAAA TTACTTAATA GAGCTAATTG ACAGAGTTGA TGACATCTAT CGQAACACTT 
CATGGGATAA TGCAGGTTTT AAAGGCTATG GAATACAGAT AGAGCAQATT CGCATTCTCA 
AGTCTCCACA AOAQOTAAAA CCTGGTGAAA AGCACTACAA CATGGCAAAA AGTTACCCAA 
ATGAAGAAAA GGATOCTTGG GATGTGAAGA TGTTGCTAQA GCftATTTAGC TTTGATATAG 
CTOAOOAAGC ATCTAAAGTT TBCrTGGCAC ACCTTTTCAC ATACCRAGAT TTTGATATGG 
GAACTCTTGa ATTAGCTTAT GTTaGCTCTC OCAGAGCAAA CAGCCATGGA GGTGTTTGTC 
CAAAGGCTTA TTATAGCCCA OTTGQGAAGA AAAATATCTA TTTGAATAGT GGTTTGACGA 
GC3VCAAAGAA TTATOGTAAA ACCATCCTTA CAAAGGAAGC TGACCTGGTT ACAACTCATG 
AATTGGGACR TAATTTTGGA GCaGAACATG ATCCGtSATGG TCTAGCAGAA TGTGCCCCQA 1440

ATGAGGACCSV GGGAGGGAAA 7ATGTCATGT ATCCX31TAGC TGT6AGTCGC GATCAC6AGA 1500 

ACAATAAOAT GTTTrCAAAC TCCAGTAAAC AATCAATCTA TAAGACCATT GAAAGTAAGG 1560 

CCCAGCACTC3 TTTTCAAGAA CGC3W5CAATA AAGTTTGTGG GAACTCOAGa GTGGATOAAG 1620 

GAGAAGAGTG TGATCCTGGC ATCATGTATC TGAAC3UVCGA CACCTGCTGC AACAGCGACT 1680 

GCACGTTGAA GGAAGGTGTC CAGTGCAGTG ACAGGAACAG TCCTTGCTGT AAAAACTGTC 1740 

ACTTTGAQAC TGCCCA(3AAG AAGTGCCAGG AGGCGATTAA TGCTACTTGC AAAGGCGTGT 1800 

CCTACTGCAC AGGTAATAGC AGTGAGTGCC CGCCTCCAGG AAATGCTGAA AATGACACTG 1860 

TTTGcrraGA tcttggcaag tgtaaggatg ggaaatgcat ccctttctgc gaoaqggaac 1920 

AOCAGCTQQA GTCCTGTGCSl TGTAATGAAA CTGACRACTC CTGCAAGGTG TGCTGCAGGG 1980 

ACCTTTCTOG OTQCTGrGTa CCCTATGTCG ATGCTGAACA AAAGAACTTA TTTTTGAGGA 2040 

AAGOAAAGCC CT6TACAGTA GGATTTTGTG ACATGAATGG CAAATGTGAG AAACGftGTAC 2100 

AGGATGTAAT TGAAOGATTT TGGGATTTCA TTGACCAGCT GAGCATCAAT ACTTTrGGAA 2160 

AGTTTTTAGC AGACAACATC GTTGGGTCTG TCCTGGTTTT CTCCTTGATA TTTTGGATTC 2220 

CTTTCAGCAT TCTTGTCCAT TGTGTGGATA AGAAATTGGA TAAACAGTAT GAATCTCTGT 2280 

CTCTGTTTCA CCXXXGTAAC GTCGAAATGC TGAGCAGCAT GGATTCTOCA TOGGTTOQCA 2340 

TTATCAAACC CTTTCCTGCG CCCCAGACTC CSVGGCCGCCT GCAGCCTGCC CCTGTGATCC 2400 

CTTCGGCGCC AGCAGCTCCA AAACTGGACC ACXA6AGAAT GGACACX»TC CAOOAAGACC 2460 

CCAGCACAOA CTCCCATATQ GACGAGGATG GGTTTGAGAA GGACCCCTTC CCAARTflGCA 2520 

QCACAGCTGC CAACTCATTT GAGGATCTCA CX36ACCATCC GGTCX3CCAGA AGTOAAAAGG 2580 

CTGCCTCCTT TAAACTGCAG CGTCRGAATC GTGTTAACAG CAAAGAAACA GAGTGCTAAT 2640 

TTAGTTCTCA GCTCTTCTCA CTTAAGTGTG CAAAATATTT TTATAGATTT GACCTACAAA 2700 

TCAATCACAG CTTGTATTTT GTGAAGACTG GGAAGTGACT TAGCAGATGC TGGTCATGTG 2760 

TTTGAACTTC CTGCAGGTAA ACAGTTCTTG TGTGGTTTGG CCCT TCTCCT TTTGAAAAOG 2820 

TAAGGTGAAA GTGAATCTAC TTATTTTGAG GCTTTCAGQT TTTAGTTTTT AAAA TATCTT 2880 

TTGACCTGTG GTGCAAAAGC AGAAAATACA GCTGGATTGG GTTATGAATA TTTAOGTTTT 2940 

T6TAAATTAA TCTTTTATAT TGATAACAGC ACTGACTAGG GAAATGATCA OTTTTTTTTT 3000 

ATACACTGTA ATGAACCGCT GAATATGAAG CATTTGGCAT TTATTTGTGA GAAAAGTGGA 3060 

ATAGTTTTTT TTTTTTTTTT TTTTTTTTGC CTTCAACTAA AAACAAAGGA GATAAATTTA 3120 

GTATACATTG TATCTAAATT GTGGGTCTAT TTCTAGTTAT TACCCAGAGT TTTTATGTAG 3180 

C3M3GGAAAAT ATATATCTAA ATTTAOAAAT CATTTGGGTT AATATGGCTC TTCATAATTC 3240 

TAAGACTAAT GCTCAGAACC TAACCACTAC CTTACAGTGA GGGCTATACA TGGTAGCCAQ 3300 

TTGAATTTAT GGAATCTACC AACTGTTTAO GGCCCTOATT TGCTOGaCAQ TTTTTCTGTA 3360 

TTTTATAAGT ATCTTCATGT ATCCCTGTrA CT6ATAGGGA TACaTGTCTT AOAAAATTCA 3420 

CTATTCGCTG GGAGTGGTGG CTCATGCCTG TAATCCCAGC ACrTGGAGAO GCTGAGGTia 3480 
OGCCACTACA CTCCAGCCTG GGTGACAGAG TGAGATCTGC CTC 

Seq ID NO: 555 Protein sequence
Protein Accession #: NP_003174.2

1 11 21 31 41 51

I I I I I I 

HItQSLIiFLTS WPFVUVPRP PDDPGFGPHQ RI<BKU3SLLS DYDILSIiSiri QQHSVKXRDL 60 

QTSTHVBTLI. TFSALXKHFK LYLTSSTERP SQNPKyWVD GKHBSEYTAK HODFFTBHW 120 

GBFDSRVIAH IRODOVIIRI NTDGAEYNIB PUtSFVMDTK OKSMLVYKSE DIKNVSSI4)S 180 

PKVCGYLKVD NEELLPKOLV DREPPBEIiVH RVK8RADPDP MKNICaOiIiW ADHRFYRYMG 240 

RGEESTTTNY I.IELIDRVDD lYRNTSWDNA GFKGyaiQIE QIRILKSPQE VKPGEKHYNK 300 

AKSYPNEEKD AWDVKML1,EQ PSPDIAEEAS KVCLAHLPTY QDFDMGTLGL AYVGSPRANS 360 

HGGVCPKAYY SPVGKKNIYL NSGLTSTKNY GKTILTKBAD LVTTHELGHN FGAEHDPDGL 420 

AECAPNBDQG GKXVMypIAV SGDHENHKMF SHCSKQSIYK TIESKAQECF QERSNKVCGN 480 

SRVDEGEEC33 PGIMYUJNDT CCNSDCTLKE GVQCSDRNSP CCKMCQFETA QKKCQEAIHA 540 

TCKGVSYCTQ HSSBCPPPGK AEHOTVCLDL GKCKDGKCIP FCEREQQLBS CACNETCHSC 600 

ICVCX3lOI.SGIt CVPyVDAEQK MJUIKGKPC TV6FCDMKGR CBXRVQDVIE RFWDFIDQLS 660 

INTEGKFLAD MIVGSVIVFS LIFWIPFSIL VKCVDKKLDK QYESLSLFHP SNVEMLSSMD 720 

SASVRIIKPF PAPQTPGRLQ PAPVIPSAPA APXIiDHQRMD TIQBDPSTDS HMDEDGFBKD 780 
PFPNSSTAAK SFEDLTDHPV ARSEKAASFK LQRQHRVNSK BTEC 

Seq ID NO: 556 DNA sequence 

Nucleic Acid Accession #: NM_021832.1

Coding sequence: 164..2248

1 11 21 31 41 51 

I I I I I I 

7CGAGCCTGG CGGTAGAATC TTCCCAGTAG GCGGOGCGGG ASQAAAAOAG GATTGAGGQG 60 

CTAOGCCGGG CGGATCCCGT CCTCCXTCOGA TGTGAGCAGT TTTCCX3AAAC CCOQTCAGGC 120 

GAAGGCTGCC CAGAGAGGTG GAGTOGGTAG CGGGGCCXK3G AACATGAGGC AOTCTCT CCT 180 

ATTCCTGACX: AGCGTGGTTC CTTTCGTGCT GGCGCX33CGA CCTCCGGATG ACCCGGGCTT 240 

CGGCCCCCAG CAOAGACTCXj AGAAGCTTGA TTCTTTGCTC TCASACTAOG ATATTCTCTC 300 

TTTATCTAAT ATCXXGCAGC ATTCGGTAAG AAAAAGAGAT CTACAOACTT CAACACATGT 360 

ASAAACACTA CTAACTTTTT CAGCTTTGAA AAGGCATTTT AAATTATACC TGACATCAAG 420 

TACTGAACQT TTTTCACAAA ATTTCAAGGT CXSTGGTGGTG GATGGTAAAA ACGAAAGCGA 480 

GTACACTGTA AAATGGCAGG ACTTCTTCAC TGGACACGTG GTTGGTQAQC CTQACTCTAG 540 

GGTTCTAGCX: CACATAAGAQ ATGATCATGT TATAATCAGA ATCAACACAG AT6GGGCOGA 600 

ATATAACATA GAGCCACTTT GaAOATTTCT TAATGATACC AAAGACAAAA GAATSTXAGT 660 

TTATAAATCT GAAGATATCA AOAATGTTTC AOGTTTGCAC TCTCCAAAAfl TOTOTGOTTA 720 

TTTAAAAGTG GATAATGAAG AGTTOCTOCC AAAAGGQTTA OTASACAGAG AACCACCTGA 780 

AGAGCrrGTT CATCGAGTOA AAAGAAQAQC TOACCKAGAT CXZCATGAAGA ACACGT6TAA 840 

ATTATTGGTG GTAGCAGATC ATCGCTTCTA CAGATACATO GGCAGAGGGO AA«3AGAQTAC 900 

AACTACAAAT TACTTAATAG AGCTAATTGA CAGAGTTGAT GACATCTATC GGAAC3VCTTC 960 

ATGGGATAAT GCAGGTTTTA AAGGCTATGG AATACAGATA GAGCAQATTC GCATTCTCAA 1020 

QTCTCCACAA GAGGTAAAAC CTGGTGftAAA GCACTACAAC ATGGCAAAAA GTTACCCAAA 1080 

TGAAGAAAAG GATGCTTGGG ATGTGAAGAT GTTGCTAGA6 CAATTTAGCT TTGATATAGC 1140 

TGAGGAAGCA TCTAAAGTTT GCTTGGCACA CCTTTTCACA TAOCAAGATT TTaATATGGQ 1200 

AACTCTTGGA TTAGCTTATG TTGGCTCTCC CAGAGCAAAC AGOCATGGAG GTGTTTGTOC 1260 

AAAGGCTTAT TATAGCCCAG TTGGGAAOAA AAATATCTAT TraAATAGTG GTTT6ACGAG 1320 

CACftAAGAAT TATCGTAAAA CCATCCTTAC AAAGGAAGCT OACCTGGTTA CAACTCATQA 1380 

ATTGGGACAT AATTTTGGAG CAGAACATGA TCCGGATGGT CTAGCAGAAT GTOCCCCGAA 1440 
TGAGGACCAG GGAGGGAAAT ATGTCATGTA TCCCATAGCT GTOAGTGGOS ATCACGAGAA
CRATAAGATG TTTTCAAACT GCAGTAAACA ATCAATCTAT AAGACCATTG RAAGTAAGGC 
CCAGGAGTGT TTTCAAQAAC GCAGCAATAA AGTTTGTGGG AACTCS3AGGG TGGATGAAGG 
AGAAGAGTGT GATCCTGGCA TCATGTATCT GAACAACGAC ACCTGCTGCA ACAGCGACTG 
CACGTTGAAG GAAGGTGTCC AGTGCAGTGA CAGGAACAGT CCTTGCTGTA AAAACTQTCA 
GTTTGAGACT GCCCAGAAGA AGTGCCAGGA GGCGATTAAT GCTACTTGCA AAGGCGTGTC 
CTACTGCACA GGTAATAGCA GTGAGTGCCC GCCTCCAGGA AATGCTGAAG ATGACACTGT 
TTGCTTGGAT CTTGGCAAGT GTAAGGATGG GAAATGCATC CCTTTCTGCG AGAGGOAACA 
GCAGCTGGAG TCCTGTGCAT GTAATQAAAC TGACAACTCC TGCAAGGTGT GCTGCRGGGA 
CCTTTCCGGC CGCTGTGTGC CCTATGTCGA TGCTGAACAA AAGAACTTAT TTTTQAGGAA
AGGAAAGCCC TGTACAGTAG GATTTTGTGA CATGAATGGC AAATGTGAGA AACX3AGTACA 
OGATGTAATT GAAOGATTTT GGGATTTCAT TOACCAGCTG AGCATCAATA CTTTTGaAAA 
GTTTTTAGCA GACAACATCG TTGGGTCTGT CCTGGTTTTC TCCTTGATAT TTTGQRTTCC 
TTTCAGCATT CTTGTCCATT GTGTGTAACG TCGAAATGCT GAGCAGCATG GATTCTGCAT 
CGGTTCGCAT TATCAAACCC TTTCTTGOGC CCCAGACTCC AGGCCGCCTG CAGCCTGCCC 
CTGTGATCCX; TTCX3GCGCCA GCAGCTCCSUV. AACTGGACCA CCAGAGAATG GACACCATCC 
AGGAAGACCC CAGCACAGAC TCACATATGG ACGA6GATGG GTTTGAGAAG GACCCCTTCC 
CAAATAGCAQ CACAGCTGCC AAGTCATTTQ AGGATCTCAC GGACCATCCG GTCACCAOAA 
GTGAAAA6GC TGCCTCCTTT AAACTGCAGC GTCAGAATCG TGTTGACAGC AAAGAAACAG 
AC3TGCTAATT TAGTTCTCAG CrCTTCTGAC TTAAGTGTGC AAAATATTTT TATAGATTTG 
ACCTACAATC AATCACaCCT TATATTTTGT GAAGACTGGG AAGTGACTTA GCAGATGCTG 
GTCATGTGTT TGAACTTCCT QCAGGTAAAC AGTTCTTQTG TGGTTTGGCC CTTCTCCTTT 
TGAAAAGGTA AGGTGAAGGT GAATCTAGCT TATTTTQAGG CTTTCAGGTT TTAQTTTTTA 
AAATATCTTT TGACCTGTGG TGCAAAAGCR OAAAATACAQ CTGGRTTGGG TTATGAGTAT 
TTACGTTTTT GTAAATTAAT CTTTTATATT OATAACAGGC ACTGACTAGG GRAATGATC3V 
GTTTTTTTTT ATACACTGTA ATGAACCQCT GAATATGAAQ CATTTGOCAT TTATTTGTGA 
GAAAAGTGGA ATAGTTTTTT TlTrr iT r r r TTTTTTTTGC CTTCAACTAA AAACAAAGGA 
GATAAATTTA GTATACATTG TATCTAAATT QTGGQTCTAT TTCTAOTTAT TACCCAGAGT 
TTTTATGTAG CaWKSGAAAAT ATATATCTAA ATTTAGAAAT CATTTGGGTT AATATGGCTC 
TTCATAATTC TAAGACTAAT GCTCAGAACC TAACCACTAC CTTACAGTGA GGGCTATACA 
TGGTAGCCAG TTGAATTTAT GGAATCTACC AACTQTTTAG GGCCCTGATT TGCTGGGCAG 
TTTTTCTGTA TTTTATAAGT ATCTTCATGT ATCCCTGTTA CTGATAGGOA TACATQTCTT 
AGAAAATTCA CTATTGGCTG GGAGTGGTGG CTCATGCCTQ TAATCCCAGC ACTTGQAGAQ 
3421 GCTGAGGTTG CGCCACTACA CTCCAGCCTG GGTQACAGAO TGAGATCTGC CTC 
I

MRQSLbFLTS 
QTSTHVETliL 
GBPDSRVUUI 
PKVCX3YLKVD 
RGEESTTTOY 
AK3YPNEEKD 
HGGVCPKAYY 



TCKGVSYCTG 
K VCCRD LSGR 
INTFGKFLAD 



WPFVIiAPRP 
TFSALKRBFK 
IRODDVIIRI 
HEELLPKGIiV 
IiIEIiIDRVSD 
AWDVKMLUEQ 
SPVGKKMIYI. 
GKYVMYPIAV 
PGIMVUINDT 
HS8ECPPPGN 
CVPYVDAEQK 
NIVOSVI.VFS LIFHIPFSIb \ 



CFKBYGIQIE 
KVCIJUII.PTY 
GKTILTKEAD 
SNCSKQSIYK 
GVQCSDRNSP 
GKCKDGKCIP 



Seq ID NO: 558 DNA sequence

Nucleic Acid Accession #: NM_004994.1

Coding sequence: 20.. 3143 



1 



CCTGAGAACC 
CACTCGG6TG 
CCAGAAGCAA 
GCGAACCCCA 
OUVGTGGCAC 
GGCGGTGATT 
CACCTTCACT 



TTTGCIGCCC 
AATCTCACCG 
GCAGAGATGC 
CTGTCCCTGC 
CGGTGCGGGG 
CACCACAACA 
GACGAOGCCT 



C GGGTATCCCT 



CATCTTCGAO 

CTGGTGCAGT 
GAGACTCTAC 
CCAAGGCCAA 
03CCACCACC 
CTCGACGGTG 
GGGTAAGGAG 



GGCCGCTCCT 
ACCA0G6CCA 
ACXXX3GGACX3 
TCCTACTCCG 
GCCAACTACG 



TGAGCCTCTG 
CCAGACAGCX; 
ACAGGCAGCT 
CIGGAGAGTC 
CCGAGACCGG 
TCCCAGACXrr 
TCACCTATTG 
TTGCCCGCGC 
GCCGGGACX3C 
TCGACGGGAA 
AC!GCCX3^TTT 
GQTTTC6AAA 
ACTCTGCCTG 
ACTACGACAC 
GCAATGCTGA 
CCTGC3VCCAC 
ACCGGGACAA 
ACTCGGCGGG 
GTACCAGCGA 
GCGACAA6AA 



GCAGCCCCTG 



I 



I 



GGCAGAGGAA 
GAAATCTCTG 
TGAGCTGGAT 



GATCCAAAAC 
CTTCGCACTG 
AGACATCX3TC 
GGACX3GGCTC 
CGACGATGAC 
CGCaGATGGC 
CACCACCGAC 



TGQQAAACCX: 



GCTCTTCGGC 
GGAGCTGTGC 
GGGCCGCGGA 
GTGGGGCTTC 



CGACGTGAAT 
AACCACCACC 
TGTCCACCCC 
AGGTCCCCCC 
TGCCTGCAAC 
CAAGGATGGG 



CTCATGTACC 
QGCATCCGGC 
ACACCGC»GC 
TCAGAGCGCC 
ACTGCTGGCC 
GTGAACATCT 
AAGTACTGGC 



CTATQTACOQ CTTCACTGAG 
ACCTCTATGG TCCTCGCCCT 
CCACGGCTCC CCCGACGGTC 
CCACAGCTGG CCCCACAGOT 
CTTCTACGGC CACTACTGTG 
TCX5ACGCCAT CGCX3GAGATT 
GATTCTCTGA GGGCAGGGGG 



ATCCAGTTTG 
CTGGCAaVOG 
GAGTTGTGGT 
GCGGCCTGCC 
GGTCXSCTCCG 
TTTGGCTTCT 
TGCCAGTrTC 
TCCGACGQCT 
TTCTGCCCGA 
GTCTTCCCCT 
GATGGGCGCC 
TGCCCGGACC 
GGCTTAGATC 



GAACCTGA6C 



1680 
1740 
1800 
1860 



2280 
2340 
2400 
2460
2520 
2580 



3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 



On>ILSI.SNI QQKSVRKItOI. 
GKNESSyrVK VIQDFFTGHW 
DKRMLVYKSB DIKNVSRt<QS 
MKHTCKIiLW ADHRFYHYMG 



QDFDMGTLGL AYVGSPRANS 
LVTTHEUaiN FQAEHDPDGL 
TIESKAQECF QERSNKVCSI 
CCKNCQPETA QKKCQEAINA 
PCEREQQLES CAOJETDNSC - 
CEICRVQOVIE RFWDFIDQLS 



GTCCTGGTGC TCCTGGTGCT 
CTTGTGCTCT TCCCTGGAGA 
TACCTGTACC GCTATGGTTA 
GGGCCTGCGC TGCTGCTTCT 
AGOSCCACGC TGAAGGCCAT 
CAAACCTTTG AGGGOSACCT 
TACTCGGAAG ACTTGCOGCG 



OCTTTCCTCC 
CCCTGGGCAA 
ACTTCCCCTT 
ACXSGCTTGCX: 



TCACTTTCCT 
TCTGGTGCX3C 
AAGGATACAG 
ATTCCTCAGT 
TGCATAAGQA 
CAOGGCCICC 



1080 
1140 
1200 
1260 

1380 
1440 
1500 
1560 
1620 
1680 




CCTTATOSCC GACAAGTGGC 
GCTCTCCAAG AAGCTTTTCT 
GGTGCTGGGC CCGAGGCX3TC 
CGGQGCCCTC CCSOAGTGQCA 
GTTCBACGTG AAGGCGCAGA 
CCCCXMGGTG CCTTTGGACA 
CCAGGACCGC TTCTACTGGC 
GGGGTACGTG ACCTATGACA 
GCAGTQCC»T GTAAATCCCC 
CAAACTGGTA TTCTGTTCTG 
TCACCTTTGT TTTTTGTTGG 
CCGCGCTGCC
TCTTCTCTCG 
TGGACAAGCT 
GGGGGAAGAT 
TGGTGGATCC 
CGCACGACGT 
GOSTGAGTTC 
TCCTGCAGTG 
ACTGGGACCA 
GAGGAAAGGG 
AGTGTTTCTA 



GACTCGGTCT 
TGGGTGTACA 
GCCGACGTGG 
AGCGGGCGGC 



ACCCIGGGGA 
AGQAGTGGAG 
ATAAACTTGG 



CGAOAGAAAG 
TTGRACCAGG 
TAGGGCTCCC 



GCCTCTGGAG 
ACCGGATGTT 
CCTATTTCTG 
TGGACCAAST 
GTCCTGCTTT 
TTGCCGGATA 
CCCTCTCTTC 
CTTT 



1920 
1980 
2040 
2100 
2160 






MSIiWQPLVLV 
RGESXSLGPA 
ITYWIQNYSE 



ACTTDGRSDG 
CTSEGRGDGR 
PMYRFTEGPP 
PTAGPTGPPS 



I 

LI.VU3CCFAA 
LLLLQKQLSL 
DLPRAVIDDA 
APPPGPGIQG 
DGLPWCSTTA 



LWCATTSNPD 
LKKDDVNGIR 
AGPTGPPTAG 
QGPFIiIAQKW 
AOVTGAIiRSQ 
AYFCQDRFYW 



PRQRQSTLVIi 
PETGELDSAT 
FAHAFALWSA 
DAHFDODELW 
NYDTDDRPGP 
DRDKLFOPCP 
SDKKWGFCPD 
HI>YGPRPEPE 
PSTATTVPLS 
PALPRKUJSV 



SLGKGVWPT 
CPSERLYTRD 
TRADSTVmCG 
QGYSIiFIiVAA 
PHPPTTTTPQ 



41 
I 

- DRQIiAEEYIiY 
VPDLGRFQTF 
SRDADIVIQF 
RFGMADGAAC 
GNADGKPCQP 



51 
1 

RYGVTRVAEM 
BGOLKNHHHN 
GVAEHGDQYP 
HPPFIPEGRS 
PPIFQGQSYS 
FTFLGKEYST 
HSSVPEALMY 



RVSSRSBLNQ V 



33Q I.yLFKtl6KXH 540 
FEEFLSXKLF FFSGRQVHVY TGASVliGFRR 600 
} MVDPRSASEV DRMFFGVPU) 660 



Seq ID NO: 560 DNA sequence

Nucleic Acid Accession #: NM_000213.1

Coding sequence: 127..5385



11 



21 



31 



51 



AC6GAGTGT6 
CGGCQCTGCA 
GTCATGOAGA 
AQCCAGATGT 
GM3CTGGAGQ 
TCCAACTCXA 
GTCCTGAGCC 
AGCGTCCCGC 

AAACTGCAGG 
ATCCTQCAGA 
CTGGTCTTCT 
GGCATCATOA 
TACAGGACAC 
ATC3VTCCCCA 
TATTTCCCTG 
CTGGAGQAGG 
CGAGGCCTTC 
CACATCCGGC 



I I I I . 

CTGCAGCCCC ATCTCCTAGC GGCAGCCCAG GCGCGGAGGG AGCSAGTCCG 
GGTCCAGOAC GGGCGCACAG CAGCAGCCGA GGCTGGCCGG GAOAGGGAGG 
CAGGGCCACa CXXTAGCCCA TGGGCCAGGC TGCTCCTGGC AGCCTTGATC 
TCTCTGGGAC CTTGGCAAAC CGCTGC3UU3A AGGCCCCAOT GAAGAGCTGC 
TCCSTGTQGA TAAGGACIOC QCCTACTGCA CABAOSAGAT 6TTCAGGGAC 
ACACCCAOGC GGAQCTGCra GCCQCGQGCT GCCAGCGGGA. 6AGCATCGT6 



TGTTTGAGCC ACTGGAGAGC CCOGTGGACC TGTACATCCT CaiGGACTTC 540 
TGTCCGATGA TCTGGACAAC CTCAAGAAGA TQGGGCAOAA CCTGGCTCGG 600 
A6CTCACCAG OSACTACy^CT ATTGGATTTG QCaUMnrrGT GOACAAAQTC 660 
AGAC3GGACAT GAGGCCTGAG AAGCTGAAGG AGCCCTGGCC CAACAGT6AC 720 
CCTTCAAGAA CGTCATCAGC CTGACAGAAG ATGTGGATGA GTTCCS3GAAT 780 
GAGAGCX3GAT CXCAGGCAAC CTGGATGCTC CTGAGGGOGG CTTCGATGCC 840 
CAGCTGTGT6 CACGAGGGAC ATTGGCTGGC GCCCGGACAO CACCCACCTG 900 
CCACOGAGTC AGCCTTCCAC TATOnGGCTG ATGGC6CCAA OGTGCTGGCT 960 
CACCTGGACA 
ACCCTG6T6C 
TCCTATAGCT 
CAGGAGGACT 
AACCTGGACA 



AGGACTACCC 



TCTCCTCACT 
CCTTCAATCG 
GGACAGAGGT 
GGGGGGAAGT 
ACGTGTGCCA 



GTCGGTGCCC 
CACCAACTAC 
GGGGGTGCTG 



GCCTGCTCQC CRAGCACAAC 
ACTACGAGAA GCTTCACACC 
CGTCCAACAT CGTGGAGCTG 
TCCGGGCCCT AGACAGCCCC 
AQACGAGGAC TGGGTCCTTT 
TGC5GGGCCCT TGAGCACGTG 
GCAACATCTA TCTGRAACCT 
QTGAX6TGTG CACCTGCGAG 



TGTGT6TGCA 
GACATTCAOC 
CAGTGCGGGC 
GACAACTTCC 
ATGGGCCAGT 
AATGCCACCT 



GCGAGGGCTG GAGTGGCXaG 



AGTGTCCCCG 
GTGTGTGTGA 
GCATCGACAG 
ACTOCCACCA 



CTACGGCGAA 
CACTTCCGGG 
GCCTGGTTGO 
CAATGGGGGC 



ACX3GCRACr 
GACAAGCCGT 
GGCXXSCTAOG 



GCTCCACCGG 



: ACOOGGGCCT CIGCGAGGAC 



CTCCTCCTCC 
TGCAAGGCCT 
GAAQACCACT 



ACUVAGAAGGG. 
AGAGAGCCXSA 
AcaWSCTACAC 
AGAAGAAGGA 
TGCOGCrCCT 
GCCTGGCftCT 
ACATGCTGCG 
GGAACCTCAA 



CTACGCTCCT 
GAGGAATGCA 
OTGOQCTGCT 



ATQACOSAGG 
GCTGTGACrG 
GACGTGGCCA 
CXaVTCTGCGA 



GAGT606CCC 
TCOSGTGTAC 
CAAOACCACA 
CTGAAGCTTA 
GGCTACTACA 



CAGCIGCTGG 



TQCQCCTQGC 
AGCTGCGCCA 
ACAAGCTCCA 
CCATTGTGQA 
CAGAGAAGCA 
CCCrCACTGC 
TGGACGTACG 
TGGAGGCCAT 



CTGCCCTCOG 
GGCCCTGCTA 
TCTCCCGTGC 
GGAGAACCra 
GGGCCXSTGAC 
TCATGCCX3CC 
CCGCCTTTGC 
GGAGGTGGAG 
GCAGACCAAG 
CACAOTGCTG 
GGTGGAACAG 
AGACCAGGAC 
GGTGCCCCTC 



GGCTCCTTCT 
CTGCTGCTAT 
TGCAACCGAG 
ATGGCCTCXO 
GTGGTCCGCT 
ACCATCAACC 
ACCGAQAACC 
GAGAACCTGA 
TTCOGGCAGC 
ATGQCGCCCC 



ACTTCAAGGT 
CCTTCCSGGA 
CTGGaCXX»A 
GGT6GCTCAT 
GCTGGAAGTA 
GTCACATGGT 
ACCACTTGOA 
GGAAGGTCAC 



TQCTGAAGCC 
ACGAGGTCTA 
AGCCCAATGC 
GCTCGGCCAA 
ACGACiCTCAA 
TGGTGGA6TT 
CTGAGGATGA 



CTCTCTGAGT 
TGGGGAGTGC 
CTGGGAGTAT 
ACGCTGCTCC 
TCCCCTCAGC 
CTGTGAGTOT 
GATCAACTAC 
CCAGGCX3TGG 
CAAGATGOTQ 
OGAGGATGAC 
CaGCACTGTC 

ccoxrracTc 

CTGTGCCTGC 
GGGCm-AAG 
CACQCCCATQ 
CAACAACATG 
GGTGCCCTAC 
TGACACTOGG 
CAGGCAGATC 



Gcooacccio 



2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 

2700 
2760 
2820 
2880 
2940 
3000 
3060 
GTAAACATCA CCATCATCAA QOAGCAAGCC AGAOAaSTGG TGTCCTTTGA GCAGCCTGAG 3120

TTCTCX3GTCA GCCGCGGGGA CCAGGTGGCC CGCATCCCTG TCATCCGGCG TGTCCTGOAC 3180 

GGCXK3GAAGT CCCAGGTCTC CTACCX3CACA CAGGATGGCA CCGCGCAGGG CAACCGGQAC 3240 

TACATCCCCG TGGAGGGTGA GCTGCTGTTC CAGCCTGGGG AGGCCTGGAA AGAGCTGCAG 3300 

GTGAAGCTCC TGGAGCTGCA AGAAGTTGAC TCCCTCCIGC GGGGCOGCCA GGTCXXKrCGT 3360

TTCCACGTCX: AGCTCAGCAA OCCTAAGTTT GGGGCCX3^CC TOGGCOWSCC CCRCTCCACC 3420 

ACCATCATCA TCaWGGGACCC AGATSAACTO GACGGGAGCT TCACQAGTCA QATaTTQTCA 3480 

TCACAGCCAC CCCCTCACGG OSACCTSGGC GCCCXX3CAGA AOOOCAATOC TAAGSCCGCT 3540 

GGGTCCAGGA AGATCCATTT CAACTGGCTQ CCCCCTTCTO GCARGCCAAT GGGGTACAGG 3600 

GTAAAGTACT GGATTCAGGG TGACTCCGAA TCCGAAGCCC ACCTGCTCGA CAGCAAGGTG 3660

CCCTCAGTGG AGCTCACCAA CCTGTACCCG TATTGCGACT RTGAGATGAA GGTGTGCGCC 3720 

TAOGGGGCTC AGGGCGAGGG ACCCTACAGC TCCCTGGTGT CCTGCCGCAC CCACCAGGAA 3780 

QTGCCCAGCX3 AGCXAGGGCG TCTGGCCTTC AATGTCGTCT CCTCCAOGGT GACCCAGCTG 3840 

AGCTGGGCTO AGCCGGCTGA GACCAACGQT GAGATCACAG CCTACGAGGT CTGCTATGGC 3900 

CTGGTCAACG ATGACAACCG ACCTATTGGG CCCATGAAQA AAQTOCIGGT TQACA ACCCT 3960

AACAACCGGA TGCTGCTTAT TGAGAACCTT CGGGAOTCCC AGCCCTACOG CTACAC6QTG 4020 

AAGGCX3CGCA ACXSGGGCCOQ CTGaaaQCCT QAGCGGGAGQ CCATCATCAA CCTGGCCACC 4080 

CAGCCCAAGA GGCCCATGTC CATCCCCATC ATCCCTGACA TCCCTATCGT GGACGCCCAG 4140 

AGCGGGGAGG ACTAOGACAG CTTCCTTATG TACAGOGATQ ACGTTCTAOG CTCTCCATOG 4200 

GGCAGCCAGA GGCCCAGCGT CTCCGATGAC ACTGAGCACC TGGTGAATGG CCGGATGGAC 4260

TTTGCCTTCC CGGGCAGCAC CAACTCCCTG CACAGGATGA CCACX»ACCAG TGCTGCTGCC 4320 

TATGGCACCC ACCTGAGCCC ACACGTGCCC CACCGCGTGC TAAGCACATC CTCCACCCTC 4380 

ACACX3GGACT ACAACTCACT GACCCGCTCA GAACACTCAC ACTCGACCAC ACTGCCGAGG 4440 

GACTACTCCA CCCTCACCTC CGTCTCCTCC CACGACTCTC OCCTGACTGC TGGTGTGCCC 4500 

GACACGCCCA CCCGCCTGGT GTTCTCTQCC CIGGCGCCCA CATCTCTCAa AOTGAGCTOG 4560

CAGGAGCCGC GGTGCGAGCG GCCGCTGCAG GGCTACAGTG TQGAtSTACCA GCTGCIOAAC 4620 

GGCGGTGAGC TGCATCGGCT CAACATCCCC AACCCTGCCC AGACCTCGGT GGTGGTG6AA 4680

GACCTCXrroC CCAACCACTC CTACGTGTTC CGCGTGOGGG CCCAGAGCCA GGAAGGCTGG 4740 

GGCCGAGAGC GTGAGGGTGT CATCACCATT GAATCCCAGG TGCACCCGCA GAGCCCACTG 4800 

TGTCCCCTGC CAQGCTCCGC CTTCACTTTG AGCACTCXX:A GTGCCCCAGG CCCGCTGGTG 4860

TTCACTGCCC TGAGCCCAGA CTCGCTGCAG CTGAGCTGGG AGOGGCCACG GAGGCCCAAT 4920 

GGGGATATCG TCGGCTACCT GCTTGACCTGT GAGATGGCCC AAGGAGGAGG GCCAGCCACC 4980 

GCATTCCGGG TGGATGGAOA CAGCCCCGAG AGCCGGCTGA CC6TGCCGGQ CCTCAGCGAG 5040 

AACSTGCCCT ACAM3TTCAA. CXSTGCAGGCC AGGACCACTQ ASGOCTTaOa GCCAGAGCGC 5100 

GAGGGCATCA TCACCATAOR GTCCCAOGAT GGAGQACCCT TCCOGCAGCT GGGCAGCOST 5160

GCCGGGCTCT TCCAGCACCC QCTGCAAAGC QAGTACAGCA GCATCACCAC CACCCACACC 5220 

AGCGCCACCG AGCCCTTCCT AGTGGATGGQ CCGACCCTGG GGGCCCAGCA CCTGGAGGCA 5280 

GGCGGCTCCC TCACCCGGC» TGTGACX:CAG GAGTTTGTGA GCCGGACACT GACCACCAGC 5340 

GGAACCCTTA GCACCCACAT GGACCAACAG TTCTTCCAAA CTTGACCGCA CCCTGCCCCA 5400

CCCXX33CCAT GTCCCACTAG GCGTCCTCCC GACTCCTCTC CCGGAGCCTC CTCAGCTACT 5460

CCATCXrrrGC ACCCCTGGGO GCCCAGCCCA CCOGCATGCA CAGAGCAGGG GCTAG GTGTC 5520

TCCTGGGAtSG CATGAAGGGG GCAAQGTCOQ TCCTCTGTGG 6CCCAAACCT ATTTGTAACC 5580 

AAAOAGCIGG GAGCAGCACA AGOACCCRGC CTTTGrrCTQ CRCTTAATAA ATOOTTTTGC 5640 
TACTG


Seq ID NO:
Protein Ac 

1 11 21 31 41 51 

I I I I I I

MAGPRPSPWA RLIOAALISV SLSGTLAHRC KKAPVKSCTB CVRVDKDCAY CTDEMFRDHR 60 

CNTQAEIiLAA GCQRESIVVM ESSFQITEET QIDTTIiRRSQ NSPQGIAVRIi RPGBERHFBIi 120 

EVFBPLESPV DLYIU4DFSN SMSDDLDHUC KHGQNIARVL SQLTSOyTIG FGKFVDKVSV 180

PQTDMRFEEOi KEPWPNSDPP FSPKHVISIiT EDVDEPRUXL QGERISGNU) APEGGFDAIIi 240 

QTAVCTRDIG WRPDSTHItLV FSTESAFHYE ADGANVUVGI MSHNDERCHL DTTGTYTQYR 300

TQDYPSVPTL VRLLAKHNII PIFAVTNVSY SYYEKLHTYP PVSSUSVLQE DSSMIVELLE 360 

EAFNRIRSNL DIRALDSPRG LRTBVTSKMF QKTRTGSFHI RRGEVGIYQV QIiRAIiEHVDG 420 

THVCQLPEDQ KGNIHLKPSF SDGLKMDAGI ICDVCTCEW KBVRSARCSF HGDFVOGQCV 480 

C2BGWSGQTC NCSTGSLSDI QPCLREGEDK PCSGHGECQC GHCVCYGBGR YEGQFCBYDN 540 

FQCPRTSGFL CNDRGRCSMG QCVCEPGWTG PSCDCPLSNA TCIDSNGGIC NGRGHCBCGR 600

CWCHQQSLYT DTICEINYSA IHPGLCEDLR SCVQCQAWQT QEKKGRTCEE CNFKVKMVDE 660 

LKHABEWVH CSFRDEDDDC TYSYTMEGDO APGPNSTVLV HKKKDCPPGS FWWI^IPLIiLt. 720 

LLPLLALLLL IjCWKYCACCK ACLAIiLPCOJ RGHMVGFKED HYMLREMLMA SIJHU3TPMIJR 780 

SGNLKGRDW RWKVTNNMQR PGFATHAASI HPTELVPYGI. SLRiARIiCTE NIiIiKPDTKBC 840 

AQLRQEVEEN LNEVYRQISG VHKLC3QTKFR QQPNACaCKQD RTIVDTVUilA PRSAKPALUC 900

LTEKQVEQRA FHDLKVAPGY YTLTADQDAR GMVBFQBGVB LVOVRVPLFI RPEDDDEKQL 960 

LVEAIDVPAG TATLGRRLVN ITIIKEQARD WSPBQPEFS VSRGDQVARI PVIRRVUSGO 1020 

KSQVSYHTQD GTAQCTIRDYI PVEGEIiLFQP GEAWKEIiQVK LLELQEVDSIa LROTQVRRFH 1080 

VQLSNPKFGA HLGQPHSTTI IIRDPDELDR SFTSQMLSSQ PPPHGDLGAP QNPNAKAAGS 1140 

RKIHFNWI.PP SOraWGYRVK VMIQGDSESE AHLLDSKVPS VELTNLYPYC DYEMKVCAYG 1200

AQGEGPYSSIi VSCRTHQEVP SBPGRLAFHV VSSTVTQLSW AEPAETNGEI TAYEVCYGI.V 1260 

NDDNRPIGPM KKVLVDNPKN RMU.IEN1.RE SQPYRYTVKA HKGAGWGPBR EAIIKIATQP 1320 

KRPMSIPIIP DIPIVDAQSG EDYDSFLMYS BDVLRSPSGS QRPSVSDDTE HLVKCTMDFA 1380 

FPOSTH8UIR MTTTSAAAYG THLSPHVPHR VI.STSSTLTR OYNSLTRSEH SHSTTLPBDY 1440 

STI.T8VSSKD SRI.TAGVPOT PTRLVFSALG PTSUWSHQB PRCERSLQSX SVEVQIjUIGG 1500

BUIRLNIPNP AQTSVWEDIj [iFNHSYVFKV SAQSQEGMGR EREGVITIBS QVBPQSPIiCP 1560

IiPGSAFTLST PSAPOPLVFT RLSPDaWlLS HBRPRRPHGD rVGYIiVTCBH AQQ6GPATAF 1620 

RVDGDSPESR LTVPGLSENV PYKFKVQART TEGPGPBRBQ IITIESQDGQ PFPQLGSRAG 1680 

LFQHPLQSEY SSITTTHTSA TEPFLVDGPT LGAQKLEAGG 8LTRBVTQEF VSRTLTTSGT 1740 
LSTHMDQQFF QT 






Seq ID NO: 562 DNA sequence
Nucleic Acid Accession #: NM_
Coding sequence: 1. .63
GCACGAGGGC GCTTTTGTCT CCC3GTGAGTT TTGTGGCGGG AAGCTTCTGC GCTGGTGCTT 60

AGTAACCGAC TTTCCTCOGG ACTCCTGCAC GACXTTGCTCC TACAGCCGGC GATCCACTCC 120 

OGQCTGTTCC CCCGGAGGGT CCAGAGGCCT TTCAGAAGGA GAAGGCAGCT CTGTTTCTCT 180 

GCAGAGGAGT AGGOTCCTTT CAQCCATOAA GCATOTOTTG AACCTCTACX: TGTTAGGTGT 240 

GGTACTGACC CTACTCTCCA TCTTCGrCAQ AGTGATaOAG TCCCTAGAAO GCTTACTAGA 300

GAGCCCATCG CCTGGGACCT CCTGGACCAC CAQAAGCCAA CTAGCCAACA CAOACCCCAC 360 

CAAGGGCCTT CCAGACCATC CATCCSWaAG CATOTGATAA GACCTCCTTC CATACTGQCC 420 

ATATTTTGGA ACACTGACXTT AGACATGTCC AGATGGGAGT CCCATTCCTA GCAGACAAGC 480 

TGAGCACCGT TGTAACCAGA GAACTATTAC TAOaCCTTGA AGAACCTGTC TAACTGGATG 540 

CTCATTGCCT GGGCAAGGCC TGTTTAGGCC GGTTGCGGTG QCTCATGCCT GTAATCCTAG 600

CACTTTGGGA GGCTGAGGTG GGTGGATCAC CTGAGGTCAG GAGTTCGAGA CCAGCCTCGC 660 

CAACATGGCG AAACCCCATC TCTACTAAAA ATACAAAAQT TAGCTGGGTG TGGTGGCAGA 720 

GGCCTGTAAT CXCAGTTCCT TGGGAGGCTG AGGCGGGAGA ATTGCTTGAA OXXSGGGACG 780 

GAGGTTGCAG TGAACCGAGA TCGCACTGCT OTACCCAGCC TGGGCCACAG TGCAAGACTC 840 

CATCTCAAAA AAAAAAAGAA AAGAAAAAGC CTGTTTAATG CACASGTGTG AGTGGATTGC 900

TTATGGCTAT GAGATACGTT GATCTOOCCC TTACOCXBOG OTCTGGTOTA TGCTGTGCTT 960 

TCCTCAGCAO TATGGCTCTO ACATCTCTTA GATGTCCCAA CTTCaCCTGT TGGGAGATGG 1020 

TGATATTTTC AACCCTACTT CCTAAACATC TCTCTGGOaT TCCTTTAQTC TTOAATBTCT 1080

TATGCTCAAT TATTTGGTGT TQAGCCTCTC TTCCACAAOA QCTOCTCCAT GTTTGOATAa 1140 

CAGTTGAAGA GGTTGTGTGa GTGGGOTGTT GOGAGTGAGG ATGGAGTGTT CAGTGCCCAT 1200

TTCTCATTTT ACATTTTAAA GTCGTTCCTC CAACATAGTG TGTATTGGTC TGAAGOGGGT 1260 

GGTGGGATGC CAAAGCCTGC TCAAGTTATG GACATTGTGG CCACCATGTG GCTTAAATGA 1320 

TTTTTTCTAA CTAATAAAGT GGAATATATA TTTOVAAAAA AAAAAAAAAA AA 

Seq ID NO: 563 Protein sequence
Protein Accession #: NP_037464.1




Seq ID NO: 564 DNA sequence
Nucleic Acid Accession #: NM_023915.1
Coding sequence: 250..1326



I I I I I I

GGCACGAGGG TTTCX3TrTTC ATGCTTTACC AGAAAATCCA CTTCCCTGCC GACCtTAGTT 60 

TCAAAGCTTA TTCTTAATTA GAGACAAGAA ACCrGTTTCA ACTTGAAGAC ACCGTATGAG 120 

GTGAATGGAC AGCCAGCCAC CACAATGAAA GAAATCAAAC CAGGAATAAC CTATGCTQAA 180 

CCXaVCGCCTC AATCGTCCCC AAGTGTTTCC TGACAOOCAT CTTTGCTTAC AGTGCATCAC 240 

AACTGAAGAA TGG6GTTCAA CTTOAGGCTT QCAAAATTAC CAAATAACGA GCTG CACCG C 300 

CAAOAGAGTC ACAATTCAGQ CAACAGQAGC GACGGGCCAG GABAGAACAC CACCCTTCAC 360 

AATGAATTTQ ACACAATTGT CTTCCCGGTG CTTTATCTCA TTATATTTGT GGCAAGCATC 420 

TTGCTGAATG GTTTAGCAGT GTGGATCTTC TTCCACATTA GGAATAAAAC CAGCTTCATA 480 

TTCTATCTCA AAAACATAQT GGTTGCAGAC CTCATAATGA CGCTGACATT TCCATTTOQA 540 

ATAGTCCaVTG ATGCAGGATT TGGACCTTGG TACTTCAAGT TTATTCTCTQ CAOATACSICT 600 

TCAGTTTTGT TTTATGCAAA CATGTATACT TCCATCGTGT TCCTTGGGCT GATAAGC34.TT 660 

GATCGCTATC TGAAGQTQGT CAAGCCATTT QGGGACTCTC GGATGTACAG CATAACCTTC 720 

ACGAAGGTTT TATCTGTTTG TGTTTGGGTG ATCATGGCTG TTTTGTCTTT GCCAAACATC 780 

ATCCTGACAA ATGGTCAGCC AACAGAGGAC AATATCCATG ACTGCTCAAA ACTTAAAAGT 840 

CCTTTGGGGG TCAAATGGCA TACGGC»GTC ACCTATGTQA ACAGCTGCTT GTTTGTGGCC 900 

GTGCTGGTGA TTCTGATCGG ATGTTACATA GCCATATCCA GGTACATCCA CAAATCC3VGC 960 

AGGCaUVTTCA TAAGTCAGTC AAGCCOAAAG OSAAAACATA ACCAGAOCAT CAGGGTTqTT 1020 

GTGGCTGTGT TTTTTACCTQ CTTTCTACCA TATCACTTQT GCAOAATTCC TTTTACTTTT 1080 

AGTCACTTAG ACAGGCTTTT AOATQAATCT GCACAAAAAA TCCXATATTA CTSCAAAOAA 1140 

ATTACACTTT TCrTGTCTGC GTOTAATaTT TQCCIGOATC CAATAATTTA CTTTTTCAia 1200 

TGTAGGTCAT TTTCAAGAAG GCTGTTCAAA AAATCAAATA TCAGAACCAG GAGTQAAAGC 1260 

ATCAGATCAC TGCAAAGTGT CAGAAGATCG GAAGTTCXSCA TATATTATGA TTACACTOAT 1320 

GTGTAGGCCT rTTATTGTTr GTTGGAATCX3 ATATOTACAA AOTQTAAATA AATGTTTCTT 1380 
TTCATTATCC TTAAAAAAAA AA 



1 11 21 31 41 51 

I I I I I I 

MGFHLTLAKL PKNKLHGQES HNSQNRSDGP GKNTTLHNEF OTIVLPVLYIi IIFVASILLN 

GLAVMIFFHI RNKTSPIPYL KNIWADLIM TLTFPFRIVH DAGFGPWYFK PILCSYTSVL
FYAiniYTSIV PLGLISIDRY LKWKPFGDS RMVSITETKV LSVCVWVIMA VLSLPNIIIjT 
NGQPTEDNIH DCSKLKSPIiO VKWHTAVTYV NSC1.FVAVI.V ILIGCYIAIS RYIHKSSRQF 
ISQSSRKRKH NQSIRWVAV FFTCFLPYHL CRIPFTFSHI. DRLLDESAQK ILYYCKBITL 

FLSACNVCU3 PIIYPPMCRS FSRRLFKKSN IRTRSBSIRS LOSVRRSEVR ITfYDYTDV


Seq ID NO: 566 DNA sequence 

Nucleic Acid Accession #: NM_005365.1

Coding sequence: 1..948 

1 11 21 31 41 51

I I I I I I

ATGTCTCTCQ AGCAGAGOA6 TCCGCACTGC AAGCCTGATG AAGACCTTGA AGCXX»AGGA 
GAaOACITGQ GCCTGATGaa -TGCACAGQAA CCCACRGGOG AGaAOGAGGA GACTACCTCC 
TCCTCTGACA GCAAGQASGA GGAGGTOTCT GCXGCIGGGT CATCAAGTCC TCCOCAGAGT 

85 CCICAQQGAa aOQCTTCCTC CTCCATTTCC eiCXACTACR CTTTATQQAO CCAATTCGAT 
QAGGGCTCCA GCAGTCAAGA AGAGOAAGAO CCAAGCTCCT CGSTOSACCC AGCTCASCTG 
GAGrrCATGT TCCAAGAAOC ACTOAAATTO AAOGTGGCTa AGTTGGTTCA TTTOCTQCTC 



CACAAATATC GAGTCAAGQA GCCGGTCACA AAGGCAGAAA TQCrGGAGAG CX3TC3VTCAAA 420

AATTACAAC3C GCTACTTTCC TGTGATCTTC GGCAAAGCXTT CCGAGTTCAT GCACGTGATC 480 

TTTGGCACTG ATGTGAAGGA GGTGGACCCC OCCGGCCACT CCTACATCCT TGTCACTGCT 540 

CTTGGCCTCT CGTGCGATAG CATGCTGGGT GATGGTCATA GCATGCCCAA GGCCGCCCTC 600 

CTSATCATTO TCCIGGGTGT GATCXTTAACC AAAGACAACT GOGCCCCTOA AGAGGTTATC 660 

TCGGAAGCGT TGAQTGTGAT GGGGGTGTAT GTTGGGAAI3G AGCACATGTT CTACGGGGAG 720 

CCCAGGAAGC TGCTCACCCA AGATTGOGTC CM3GAAAACT ACCTGGA6TA CCGGCAGGTO 780 

CCCGGCAGTG ATCXnOOGCA CTAOGAQTTC CIGTGGGGTT CCAAGGCCCA OGCTQAAACC 840 

AGCTATGAGA AGGTCATAAA TTATTTGOTC ATGCTCAATQ CAAGAGAQCC CATCTGCTAC 900 
CCATCCCTTT ATGAAGAGGT TTTGGGAQAa GAGCAAGAG6 GAGTCTGA 



I I I I I I

MSLEQSSPHC KPOEOIiEAQG EDLGIMQAQE FTGEEBBTTS SSOSKEEEVS AAGSSSPPQS 

PQGGAS8SI8 VYYTLWSQFD ECSSSQBEEE FSSSVDPAQL EFMFQEALKL KVAELVHFUi 

HXYRVKEPVT ECAEMLESVIK NYKHYFPVIF GKASEFMQVI FBTDVKEVDP ACTSYILVTA 

LGLSCDSMLG DGHSMPKAAL, LIIVLGVILT KDNCAPEEVI WEALSVMGVY VaKEHMFYGE 

FRKLLTQDWV QENYIiEYKQV POSDPAHlfBF LWaSKAHAET SYBIWIMYLV MLNASEPICY 
PSLYEEVLGE EQEOV 



Coding sequence: 86..1126

1 11 21 31 41 51

GGTTACTCAT CCTGGGCTCA GGTAAGAGGG CCCGAGCTCG GAGGCGGCAC ACtX3W3GGGG 60

GACX3CCAAGG GAGCAGGACG GAGCCATGGA CCCCGCCAGG AAAGCAGQTG CCCAGGCCAT 120 

GATCTGGACT GCAGGCTGGC TGCTGCTGCT GCTGCXTCGC GQAGGAGCGC AGGCCCTOOA 180 

GTGCTACAGC TGOGTOCAGA AAOCAGATOA OQGATGCTCX: CCOAACAAGA TGAAOACAOT 240 

GAAGTGCGOG COGGG06TGO AOGTCTGOVC CQAGGCCGTa CX3GGGGOT60 AOACCATCCA 300 

COGACAATTC TOGCTGGCAO TGCSGGGTTG CGGTTCGGGA CTCCXXX3QCA AGAA30ACC0 360

CGGCCTGGAT CTTCACGGGC TTCTGGCXJTT CATCCRGCTG CAGCAAT60G CTCAGGATCG 420 

CTGCAACX5CC AAGCTCAACC TCACCTCGCG GGCGCTCGAC CCGGCAGGTA ATGAGAGTGC 480 

ATACCCGCCC AACGGCGTGG AGTGCTACAG CTGTGTGGGC CTGAGCCGGG AGGCGTGCCA 540 

GGGTACATCG CCGCCGGTCG TGAGCTGCTA CAACGCCAGC aATCATGTCT ACAAGGGCTG 600 

CTTCGAC3GGC AACXTTCACCT TGACGGCAGC TAATGTGACT GTGTCCTTGC CTGTCCX3GGG 660

CTGTGTCCAG GATGAATTCT GCACTCGGGA TGGAGTAACA GGCCCAGGGT TCACGCTCAG 720 

TGGCTCCTGT TGCCAGGGGT CCOGCTGTAA CTCTGACCTC OGCAACAAGA CCTACTTCTC 780 

CCCTCGAATC CCACCCCTTO TCOQOCTCiCC CCCTCCAGAG CCCACGACTG TCGCCTCAAC 840 

CACATCTGTC ACCACTTCTA CCTCQGCCCC AOTCMGACCC ACATCC31CCA CCAAACCCAT 900 

GCCAGCGCCA ACCAGTCAGA CTCCQAQACA GGGAOTAGAA CACGAGGCCT CCCGGGATGA 960

GGAGCCX31GG TTGACT6GAG GCGCCGCTGG CCACCAGGAC CGCAGCAATT CAGGGCSMSTA 1020 

TCCXGCAAAA GGGGGGCCCC AGCA6CCCCA TAATAAAGGC TGTGTGGCTC CCAC3«jCTGG 1080

ATTGGCAGCC CTTCTGTTGG CCX3TGGCTGC TGGTGTCCTA CTGTG&GCTT C TCCACCIBQ 1140 

AAATTTCCCT CTCACCTACT TCTCTGGCEC TGGGTACCCC TCXTCTCATC ACTTCCTGTT 1200 

CCCACCACTG GACTGGGCTG GCCCAGCCCC TGTTTTTCCA AC3VTTCCCCA GTATCCCCAG 1260

CTTCTGCTGC GCTGGTTTGC GGCTTTGGGA AAXAAAATAC CGTTGTATAT ATTCTGGCAG 1320 

GGGTGTTCTA QCTTTTTGAQ GACAGCTCCT GTATCXTTTCT CATCCTTGTC TCTCCGCTTG 1380 

TCCTCTTGTG ATGTTAGGAC AGAGT6AGAG AAGTCAGCTG TCA CGGGG AA GGTQAGAGAQ 1440 

AGGATGCTAA GCTTCCTACT CSVCTTTCTCC TAGCC3MSCCT GGACTTTGGA GCGTGGGGTG 1500

GGTGGGACAA TGGCTCCCCA CTCTAAaCAC TGCXTTCCCCT ACTCCCOGCA TCTTTGGGGA 1560

ATCGGTTCCC CATATGTCTT CXTTTACTAGA CTGTGAGCrC CTOQA6GQCA GGQAC06TGC 1620 

CTTATOTCTG TOTGTGATCA GTTTCTGGCA CATAAATGCC TCAATAAAGA TTTAArTACT 1680 
TTGTATAGTG AAAAAAAA 



1 11 21 31 41 51 

itDPARKAGAQ AMIWTAGHLL LIiLUlGGAOA I.ECXSCVQKA DDGCSPHKMK TVKCAPGVDV 60

CTEAVGAVBT IHGQFSLAVX GCGSOLF6KN DSOLDLHGliL AFIQLQQC3V0 DRCHAKINIiT 120 

SSALOPAGNE SAYPPN6VEC TSCVGLSRBA CQGT5PPWS CYHA SDHVYK GCFIXSIVTLT 180 

AANVrVSLPV RGCVQDBFCT RIX3VTGPGFT LSGSCCQGSR CNSOIiKMtei'i; PSPRIPPLVR 240 

tPPPEPTTVA STTSVTTSTS APVRPT9TTK PMPAPTSQTP ROGVEHEASR DSBPRIiTGOA 300 

AGHQDRSMSG QYPAKGGPQQ PHNKGCVAPT AGLAALLIAV AAGVUi

Seq ID NO: 570 DNA sequence 
Nucleic Acid Accession #: NM_005329.1
Coding sequence: 1..1662

1 11 21 31 41 51 

ATGCCXXITGC AGCTGACGAC AGCCCTGOGT GTGGTGGGCA CCAGCCTGTT TQCCCTGGCA 60 

GTGCTGGQTG GCATCCTGGC AGCCTATGTG ACGGGCTACC AGTTCATCCA CACGGAAAAG 120

CACTACCTGT CCTTCGGCCT GTACGGCGCC ATCCTGGGCC TGCaCCTQCT CATTCAGAGC 180

CTTTTTGCCT TCCTGGAGCA CCGGCGCATG CGACGTGCCQ GCCAGGCCCT GAAGCTGCCC 240 

TCCCCGOGGC GGGGCTOGGT GGCACTGTGC ATTGCOGCBT ACCAGGAGGA CCCTGACTAC 300 

rraCGCAAGT GCCTGCGCTC GGCCCAG06C ATCTCCTTCC CTGACCTCAA GGTGGTCATa 360 

GTCGIGQATG GCAACCOCCA GGAGQACGCC TACATGCTOG ACATCTTCCA CBAQGTGCTG 420 
GGCOSCACCG AGCAGSCOOa g r T C r r TGTO TGGCGCRGCA ACTTCCATGA GOCAGGCGfg
GGTGAGACGG AGGCCAGCCT OCAGGAQ) ~ 
AGCACCTTCr CGTGCKICAT GCAGAAS 



r GTACACGGCC 600 
TTCAAGGCCC TOSGCGATTC
GATCCAGCCT GCACCATCGA 
GTCGGGGGAG ATGTCC3U3AT 
GTGCGGTACT GGATGGCCTT 
CAGTGTATTA GTGGGCCCTT 
GACTGGTACC AT,CAGAAGTT 
ACX»ACCGAG TCCTGAGCCT 
ACAGAGACCC CCACTAAGTA 
TACTTCCX3GG AGTGGCTCTA 
TACGAGTCAG TGGTCACXMG 
TTCTACCGGG GCCGCATCTQ 
ATTATCAAGG CCACCTAOSC 
CTCTACTCCC TCCTCTATAT 
ATCAACAAAT CTGQCTGGGa 
CTCATTCCTG TGTCCATCTG 
TGCCAGGACC TGTTCAGTGA 
GGCTGCTACT GGGTGGCCCT 
AAGAAGCCGG AGCAGTACAG 






GGTGGACTAC 
GATGCTTCGA 
CCTCAACAAG 



GGGCATGTAC 
CCTAGGCAGC 
TGGCTACCGA 
CCTCCGGTGG 
CAACTCTCTG 
•CTTCTTCCCC 
QAACATTCTC 
Cl-GCTTCCTT 
GTCCAGCCTT 
CACCTCTGGC 



ATCCaVGOTOT 
GTCCT6QAGG 
TAOGACTCAT 
CGGGCCIGCC 
OGCAACAGCC 
AAOTQCAGCT 
ACTAAQTATA 
CTCAACCaOC 
TCGTICCKVA 
TTCTTCCTCA 
CTCTTCCTGC 
CGGGGCAATG 
CTGCCGGCCA 



GCQACTCTGA 
A6GATCCCCA 
GGATTTCCTT 
AGTCCTACTT 
TCCTCCaGCA 



CTTGGCTTTT 



CTCCTGOGAG 
QCCnCCTTG 
TATCTOGCCA 
GCTGAGGTGT 



AAACCOGCTO 
AGCACCACCT 
TTGCCACG6T 
TGACGGTGCA 
CAGAGATGAT 
AGATCTTTGC 
TTGTGGTGAA 
GGCTGGCCTA 



GTTCCTGGAG 
COG6CACCTC 
CAAGTGCCTC 



CTGGATGACC 
TATACAGCTT 
GCTGGTGGGC 
CTTC»TGTCC 
CATTGCTACC 
CTTCATTGGC 
CACAGCTTAT 
lATACIOXAT 
GOC3ATGTGGG 



1020 

1080

1140 
1200 
1260 
1320 
1380 

1500 
1560 
1620






WGTSI.FALA 
RBAGQAUCLP 
YMLDIFHBVL 
GGKREVMVTA 



VLGGILAAW 
SFRRGSVALC 
GGTEQAQPFV 
FKALGDSVDY 



I 

MPVOLTTALR 
LFAFI£HRRM 
WDGNRQEDA 
STFSCIMQKW 
VGGDVQILNK 
DWYHQKFLGS 
YFREWLYKSL 
IIKATYACPIi 
UPV8IWVAV 
KKPEOYSLAF 



Seq ID NO: 572 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 148-7095



K. Hni9FGLyQA ILGI1HI.LIQS 



LLGOIAYTAY OQOLF8ETSL 



IQVCDSDTVIi 
RAOQSYFGCV 
TKSTARSKCI. 
FFIiIATVIQL 
liPAKIFAIAT 
AFIiVSQAILY 



GBTEASIiQEQ 
DPACTIEMLR 
QCISGPLGMY 
TBTPTKYLRW 
FYRGRIMNIL 



GCyWVALLMIi 



VIiBEDPQVGQ 
RNSLLQQFI.E 
LMQQTRWSKS 
LFLLTVQIiVa 
RKTIWNFIG 
YIAIIASROa 



CACACATAG6 CACXX»OQAT C 



CAGCTCCTCT 
CTTGTTGAAG 
AAATATCCAA 
CAAGTAAATG 



QATTTCAAAQ 
TTAOATCCAT 
AATGGCTCAT 
ACAQTTAGCA 
TCTGGTTATG 
TTCTCTAGAC 
AGTTCAGAAC 
TGGGAAAGAC 



GGTQCTATTC 
TGCACTAATG 
AATCCTGAAC 



ACGAAATACA 
AAGGGTGATG 
ACAGAAAAAG 
GAAGGTACTT 
AACTTGTCGG 



CATGTAATAG 
TGAATCTTAA 
TTCATAACAC 
GAGTTTCAGA 
TGTCATCTGA 
TCTACTGCTT 
AGTTAAGAGC 
CGATTATTGA 
TCATACTGTT 
TGACATCTCC 
TCTCTGAAAQ 
TCATGCTGAT 

AGGTGTrrrc 

CAGAAAATGT 
CTCGAGTCGT 
GAGAGGACCA 
TCSiATAATTT 
GCTTATATGG 
TTGATCTTTT 
AAGACATTGA 
GGAAAAAGGA 
ATGAA6CCAA 



TCTGGAAATG 
CCTGGATTGQ 
GTCCTATACA 
CCCaUkAAC3KA 
GAAACTTAAA 
TGGGAAAACA 



GCTAATGGAT 
GGAGCACTGA 
TCTCCTATCA 



GXGGAAATTA 



TQGATCAGAG 
TC3A.T60GGAC 
TTTATCCATT 
TGGAfrrCQAA 



ATATTTCCTT 
CAGCCTCTTT 
GGACTGCAGA 
CCAGTTTCAA 
CTATCCXaiTT 



QTQTGGTTrC 
AGCTTTCTCC 
TCCTTTTCTG 
CATTATTCTA 



GTATACAATG 
ACCCCTTTQT 
TCGGCCTTGC 
TCTTCCTATG 
TTTCGCCATC 
GATAAGGTGC 



CTTCATCAGG 
CTAGCTCTAC 
AGACTAATTA 
CAGGCCCAGT 
CCTTTQCCTA 
AGGATTTGGT 



TGCTTGACAA 
ATGCTACGCC 
ATGGTGCACC 
TGCATACAGT 
CCTTGCATQC 



TCCCT6CACA 
CCAGTTGGCT 
GGACTACTTA 
CTCATACACT 
TCAGGCTGAC 
TTATGATACC 
AACXaWVGCAT 
GCTACCXaAT 
AAAATACAGC 
CCCTGAATTA 
AGAAOGOGCr 
ACCCCAQATT 
GACTAACCGA 
ATCTTTAAAT 
GACTTCTCAG 
AAATGATGGC 
ATCCTTAAAT 
GCTTGATACT 
CATCTCTGAG 
ATAT6ATGTC 
TTCAGAAGAA 
AOACATAACA 
CACTGAGATA 
GATGTCACAG 
CTTCCCAACT 
CTCCACGGTC 
TCTTCAACCT 
TCAGATCCTC 
TGTATTTOCC 
TTTGCTTCCA 
TTCTCAAATC 
TTCTCTQCCA 



CATA6TTTAG 
OGATTTTCAA 
TTGTTTGAGG 
AGTGTTAOTC 
CCAAACTCAA 
GACACAGTTG 



ACTAC3USACA 
ATCAAAAAAA 
ATATTGATGA 
GGGATAAAAC 
ATCTCACTAA 
AGATAACTTT 



TTGGGGAAAG 



ATCATTGGAA 420 



CAAAACAATT 
GGAAAGOAAG 
CCAGAGAATT 
ATGATTGAGA 
GAATTTTTGA 
ATGAGTTATG 
GACCAACTGA 
ATTG6AACT6 
ATTSraAATC 
TCIACCACAA 



GTTTTGAGGA 
TTGGGACAGA 
QTTTTGGGAA 
CTCACAAGTA 
ACTGGATTGT 
AAGTTCTTAC 
TTGGAQAGCA 
AGATTCATGA 
ATACCAGCCT 



TCACTQGGGA 
ATTTCCACTT 
AGCAGTCAAA 
AGAAAATTTG 



CAGATGGCTA TCAAGACTTG 
TTCTTCAGAT AGTAGCCATA 
TTGTCGACAT GCCTACTGAT 
AAGAAATAAT CAAGGAGGAG 



TCCACTTCCC AACCAGTCAC 
ACTGTGACTG AACTGCCACX 
TCTAAAACTG TTCTTAGATC 
ACAGTTTCTA TAACAGAATA 
GGAGCTOAAG ATTCTTCAGO 
AACATATCXS: AAGGGTATAT 
CTTATACCAG AATCTGCIAO 



OGTGTTGATG 
G6TCCCTCAG 
GAGGTAACAC 
AACGTGGTAT 
TCCTACAGTA 
AACACTACOC 
AGTQT0QAX6 
TTTTCCTCTQ 
CTTCCACAAO 



ATGTTGGATC 
AATCTGAGAA 
TTACAGATCr 
CTCATGCTTT 
ACTCGCAGAC 
GTGAAGTCTT 



ATTCTCTGGA 
TAAATTAGCC 
TCACACTGTG 
TCCACATATG 
TGAGQAGOAG 
CTCCASTCCC 
ATTTTCCTCC 
AAATOCTTOC 
GGAGQQAAAT 



QACAACCAAG 
GGAAATCOCA 
TACCXC3^TCC 
AACCCAACCO 
TCCTCTAGTC 



TOTCATTTOA ATCCRTCCTG 
CTTCCTTCAQ TAGTQAATTO 
TTACTTCACX: TACOAGAGT 
OTGATTtOCT ATTAflAfiCXX: 



1140 
1200 
1260 
1320 
1380 
1440 
1500
1560
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160
2220 



2460 
2520 
2580 
2640 
2700 
2760 
AGCCTTGCTC AGTATTCTGA TGTGCTGTCC ACTACTCATG CTGCTTCAGA GACX3CTGGAA 2820

TTTGGTAGTG AATCTGGTGT TCTTTATAAA ACGCTTATGT TTTCTCAAGT TQAACCACCC 2880 

AGCAGTGATG CCATGATGCA TGCACGTTCT TCAGGGCCTG AACCTTCTTR TGCCTTGTCT 2940 

GATAATGAGG GCTCCCAACA CATCTTCACT GTTTCTTACA GTTCTGCAAT ACCTGTGCAT 3000 

GATTCTGTGG GTGTAACTTA TCAGGGTTCC TTATTTAGCX; GCCCTAGCCA TATACCAATA 3060 

CCTAAQTCTT CGTTAATAAC CCCAACTGCA TCATTACTGC AGCCTACTCA TGCCCTCTCT 3120 

GGTGATGGGG AATGGTCTGG AGCCTCTTCT GATAQTGAAT TTCTTTTACC TGACRCMAT 3180 

GGGCTGACAG CCCTTAACAT TTCTTCACCT GTTTCTGTiWS CTGAATTrAC ATATACAACA 3240 

TCT6TGTTTG GTGATGATAA TAAGGCGCTT TCTAAAAGTG AAATAATATA TQGAAATGAG 3300 

ACTGAACTGC AAATTCCTTC TTTCAATGAG ATGGTTTACC CTTCTGAAAO CACAGTCATG 3360 

CCCAACATGT ATGATAATGT AAATAAGTTG AATGOSTCTT TACAAGAAAC CTCTGTTTCC 3420 

ATTTCTAGCA CCAAGGGCAT GTTTCCAGGG TCXXMTGCTC ATACCACCAC TAAGGTTTTT 3480 

GATCATGAGA TTAGTCAAGT TCCAGAAAAT AACTTTTCAG TTCAACCTAC ACATACTGTC 3540 

TCTCAAGCAT CTGGTGACAC TTCGCTTAAA CCTGTGCTTA GTGCAAACTC A GAGC CAGCA 3600 

TCCTCTGACC CTGCTTCTAG TGAAATGTTA TCTCCTTCAA CTCAOCTCTT ATTTTATGAQ 3660 

ACCTCAGCTT CTTTTAGTAC TQAAOTATTO CTACAACCTT CCTTTCAGGC TTCTGATGrr 3720 

QACMCTTGC TTAAAACTGT TCTTCCAGCT GTSCXCAGTQ ATCCaVATATP GQTTGAAACC 3780 

CCCAAAGTTG ATAAAATTAG TTCTACAATG TTGCATCTCA TTGTATCAAA TTCTGCTTCa 3840

RGTGAAAACA TGCTGCACTC lACATCTGTA CCASTTTTTG ATGTGTCGCC TACTTCTCAT 3900

ATGOVCTCTG CTTCACTTCA AGOTTTGACC ATTTOCTATG CAAGTGAGAA ATATGAACCav 3960 

GTTTTGTTAA AAAGTGAAAG TTCCCACCRA GTGGTACCTT CTTTGTACAG TAATGATGAG 4020 

TTGTTCCAAA OGGCCAATTT GGAOATTAAC CAGGCCCATC COCCAAAAGQ AAGGCATGTA 4080 

TTTGCTACAC CTGTTTTATC AATTOATGAA CCATTAAATA CACTAATAAA TAAGCTTATA 4140 

CATTCCGATG AAATTTTAAC CTCCACCAAA ASTTCTGTTA CTGGTAAGQT ATTTGCJXIGT 4200 

ATTCCAACAG TTGCTTCTGA TACATTTGTA TCTACIGATC A-rrCTGTTOC TATMGRART 4260 

GGGCATGTTG CCATTACAGC TGTITCTCMC CACAOAiSATG GTTCTOTAAC CTCAACAAAG 4320 

TTGCTGTTTC CTTCTAAGGC AACTTCTGAO CTGAGTCATA GTGCX3WUVTC TGATGCMGGT 4380 

TTAGTGGGTG GTGGTGAAGA TGOTQACACT GATGATGATQ GTGATGATGA TGATGATQAC 4440 

AGAGGTAGTG ATGGCTTATC CATTCATAAG TGTATGTCAT GCTCATCXTTA TAGAGAATCA 4500

CAGGAAAAGG TAATGAATGA TTCAGAC»CC CAOGAAAACa GTCTTATGGA TCAGAATAAT 4560

CCAATCTCAT ACTCACTATC TGAGAATTCT GAAGAAGATA ATAGAGTCAC AAGTGTATCC 4620 

TCAGACAGTC AAACTGGTAT GGACAGAAGT CCTGGTAAAT CRCCATCRGC AAATGGGCTA 4680 

TCCCAAAAOC AGAATGATGG AAAAOAGOAA AATGACATTC AQACTQGTAG TGCTCraCTT 4740 

CCTCTCAGCC CTOAATCTAA AOCATGGGCA GTTCIGACAA QIQATOAAlQA AA STOQAT CA 4800 

GGGCAAGQTA CCTCAGATAG CCTTAATGRG AATOAGACTT CCACAGATTT CASTTTTGCA 4860 

GACACTAATG AAAAAGATGC TGATGGGATC CTGGCaGCAO QTOACTCAGA AATAACTCCT 4920 

GGATTCCCAC AGTCCCXKAC ATCATCTGTT ACTAGOSAGA ACTCAGAAGT QTTCCACQTT 4980 

TCAGAGGCAG AGGCX»GTAA TAGTAGCCaT GAGTCTCGTA TTGGTCTAGC TGAGGGGTTG 5040 

GAATCCGAGA AGAAGGCAGT TATACCCCTT GTGATCGTGT CAGCCCTGAC TTTTATCTGT 5100 

CTAOTGOTTC TTGTGGGTAT TCTCATCTAC TGGAGGAAAT GCTTCCAGAC TGCACACTTT 5160 

TACTTAGAGG ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 5220 

ATTTCAGAT6 ATGTOGQAGC AATTCCAATA AAQCACTTTC CAAASCATGT TOCAGATTTA 5280 

CATGCAAGTA GTGGGTTTAC TQAAGAATTT GAOACACCaA AAGAGTTTTA OCAOQAA6TG 5340 

CAGAGCTOTA CTOTTQACTT AGGTATTACA GCAGACAGCT CCAACCACCC AOACAACAAG 5400 

CACAAGAATC GATACATAAA TATOQTTGCC TATGATCATA GCAGGGTTAA GCTAGCACAQ 5460 

CTTGCTGAAA AGGATGGCAA ACTOACTGAT TATATCAATG CCAATTATGT TGATGGCTAC 5520 

AACAGACCAA AAGCTTATAT TGCTCCCCaA GGCCCACTGA AATCCACAGC TGAAGATTTC 5580

TGGAGAATCA TATGGGAACA TAATGTGGAA GTTATTGTCA TGATAACAAA CCTCOTGGAQ 5640 

AAAGGAAGGA GAAAATGTGA TCAGTACTGG CCTGCOGATG QGAGTGAGGA GTAgGGGAAC 5700 

TTTCIGGTCA CTCAGAAOAQ TGrGCAAOTG CTTGCXTTATT ATACTGTGAG GAATTTTACT 5760 

CTAAGAAACA CAAAAATAAA AAAGGGCTCC CAGAAAGGAA GACCCAGTGG ACOTGTGGTC 5820 

ACACAOTATC ACTACACGCA GTGGCCTGAC ATGGGAGTAC CAGAGTACTC CCTGCCAGTG 5880 

CTSACCTTTG TGAOAAAGGC AGCCTATGCC AAGCGCCATG CAGTGGGGCC TGTTGTajTC 5940 

CACTGCAOTQ CTGGAaTTGG AAQAACAGGC ACATATATTG TGCTAGACaO TATGTTGCAG 6000 

CAGATTCAAC ACGAAGOAAC TGTCAACATA TTTGGCTTCT TAAAACACAT CCOTTCACAA 6060 

AGAAATTATT TGGTACAAAC TGAGGAGCAA TATGTCTTCA TTCATGATAC ACXOOTTGAa 6120 

GOaTACTTA GTAAAGAAAC TGAGGTGCTG GACaiGTCATA TTCATGCCXA 'Wl'l'AMXSCA 6180 

CTCCTCATTC CTGGACCAGC AGGCAAAACA AAGCTAGAGA AACAATTCCA OCTCCTQAQC 6240 

CAOTC3UUITA TACAGCAGAG TGACTATTCT GCAGCCCTAA AGCAATGCaA CAGGQAAAAG 6300 

AATOGAACTT CTTCTATCAT CCCTGTGGAA AGATCAAGGG TTGGaVTTTC ATCCCTGAGT 6360 

CQAQAAGGCA CAGACTACAT CAATGCCTCC TATATCATQG QCTATTACCA OAGGRATOAA 6420 

TTCATCATTA CCCAGCSVCCC TCTCX^TTCAT ACCATCAAGG ATTTCTGSAG QATGATATGO 6480 

GACCATAAT6 CCCAACTTGGT GGTTATGATT CCTGATGGCC AAAACATGGC AGAAGATGAA 6540

TTTGTTTACT GGCCAAATAA AGATGAGCCT ATAAATTGTQ AGAGCTTTAA QGTCACTCTT 6600 

ATGGCTGAAG AACACAAATG TCTATCTAAT GAGGAAAAAC T TAIAA TTCA GGACTTTATC 6660 

TTAGAAGCTA CACAGGATOA TTATGTACTT GAAGTOAGGC ACTTTCAGTQ TCCTAAATQG 6720 

CCAAATCCAG ATAGCCCCAT TAGTAAAACT TTTGAACTTA TAAGTGTTAT AAAAQAAOAA 6780 

GCTGCCAATA GGGATGGGCC TATGATTGTT CATQATOAGC ATOGAGOAOr GACGGCAGQA 6840 

ACTTTCTGTG CTCTGACaAC CCTTATGCAC CAACTAQAAA A AGAAA ATTC CGTGGATGTT 6900 

TACCAGGTAG CCAAGATGAT CAATCTOATG AGGCCAGGAG TCTTTGCTGA CATTGAGCaG 6960 

TATCAGTTTC TCTACAAAGT GATCCTCAGC CTTGTGAGCA CAAGGCAGGA AGAGAATCCA 7020 

TCCACCTCTC TGGACAGTAA TGGTGCTWSCA TTGCCTGATQ OAAATATAGC TGAGAGCTTA 7080 

GAGTCTTTAO TTTAACACAG AAAGGGGTGG OQGGACTCAC ATCTQAGCAT TGTTTTCCTC 7140 

TTCCTAAAAT TAGGCAGQAA AATCA6ICTA GTTCIOTTAT CTGITGATTT CCCATCACCT 7200 

GACAGTAACT TTCATGACAT AGGATTCT6C CGOCAAATTT ATATCATTAA CA ATOtG TGC 7260 

CmCTTGCAA GACTTGTAAT TTACTTATTA TOTTTGAACT AAAATGATTG AATTTTACAG 7320 

TATTTCTAAG JATGGAATTG TGQTATTTTT TTCTGTATTG ATTTTAACAG AAAATTTCKA 7380 

TTTATAQAGG TTAfiGAATTC C»AACTACAQ AAAATOTTTG TTTTTAGTGT CAAATnTTA 7440 

GCTOTATTTG TAGCAATTAT CAGGTTTGCT AGAAATATAA CTTTTAATAC AGTAGCCTGT 7500 

AAATAAAACA CTCTTCCATA TQATATTCAA CATTTTACAA CTGCAGTATT CACCTAAAGT 7560 

AGAAATAATC rGTTACTTAT TeTAAATACT GCCCTAGTGT CTCCR TGQAC CA A ATTTATA 7620 

TTTATAATTG TAGATTTTTA TATTTTACTA CTGAGTCAAQ TTTTCTAGTT CTGTGTAATT 7680 
GTTTAOTTTA ATGAOGTAGT TCATTAGCTG GTCTTACTCT ACCAGTTTTC T6ACATTGTA 7740 
•XTQTGITACC TAAfiTCATTA ACTTTGTTTC AOCATGTAAT TrTAACTTTT OTGGAAAATA 7800 
GAAATACXTT CATTTTBAAA GRAGTTTTTA TGASAATAAC ACCTTACCAA ACATTGITCa 7860 
AA-reaTTTTT ATCCAAGGAA TTGCAAAAAT AAATATAAAT ATTOCCaTTA AAAAAAAAAA 7920 
AAAAAAAAAA AAAAAAAAAA AAAA 



MHILKRFLAC 
QSPINIDEDL 
FKASKITFHW 
IIiFEVSTEEN U^FKAIIDGV E 
TDTVDHIVFK DTVSISESQti Jl 



31 

I 

KIiVEEIGWSY 
EMTFIHNT6K 
LEMQIYCPDA 
ALOPFZLLNIi 
QSGYVMLMDY 



LIGTBEIIKE EEEGKDIBBG 



LGAIUmLLP 
EEEGKDIBBG 
GKGDVPNTSL 



AIVMPGRDSA 
HSTSQPVTKI. 
NTVSITBYEE 
VLIPESARNA 



ICTNGLTCKY 
TNQIRKKEPQ 
ATEKDISLTS 
ESIiTSFKIiD 






TEVTPHAFTP SSItQQDIiVST VNWYSQTTQ 



PVSVAEFTYT 
IiNASIiQETSV 
KPVLSAHSEP 
AVPSDPILVE 
TISYASEKYE 
EPUITLIHKL 



IPKSSLITPT 
TSVFGDDNKA 
SISSTKGMFP 



ASIiUJPTKAL 
LSKSEIIYGH 
GSLAHTTTKV 
LSPSTQLLFY 



FVYKGETPLQ 
LSSYDGAPLli 
PSIAQYSDVL 
8DNEGSQHIF 



ETELQIPSFM 
FDHEISQVPE 
ETSASFSTEV 



41 
1 

TGAIiNQKiniG 
TVEINLTNDY 
DRFSSFEEAV 
LPWSTDKYYI 
LQNNFREQQY 
TMIEKFAVLY 
SDQLrVDMPT 
ISTTTHYNRI 
QTVTEUPHT 
TGAEDSSGSS 
ESLKDPSMEG 
QGPSVTDLEM 
PSYSSEVPPL 
PFSSASFSSE 
STTHAASBTL 
TVSYSSAIPV 
SOSEFLLPDT 



TPKVDKISST 
PVUiKSESSH QWPSLYaiD E 



NNPSVQPTHT 
LLQPSFQASD 
VPVFDVSPT3 



KKYPTOJSPK 60

RVSGGVSEMV 120 

KGKGKLRAI.S 180 

YNQSLTSPPC 240 



1020 
1080 
1140 



VISTPPTPIF 



QGPLKSTAED 
VIiAYYTVRNF 
AKRHAVGPW 
QYVFIHDTLV 



SQEKVMNDSD 
LSQKHKDGKE 
ADTOBKDADG 
IfSEKKAVIP 
PISODVGAIP 
KHKMRYINIV 
FHRMIWEHNV 
TIiRNTKIKKG 
VHCSAGVGRT 
EAILSKETEV 
KNRTSSIIPV 
WDmiAQLWM 



THENSLMDQH NPISYSIiSEH 



VSTDHSVPIG 
TDDDGDDDDD 
SEEDNRVTSV 



GTKYNEAKTN 
VEGTSASLMD 
PATSAIPFIS 
NVWFPSSTDI 
PHYSTFAYFP 
VTPIiLLDNQI 
LFRHUJTVSQ 
BFGSBSGVLY 
HDSVGVTYQG 
DGLTALNISS 
MPNMYDNVNK 
VSQASGDTSL 
VDTLLKTVLP 
HKHSASLQGL 
VFATPVIiSID 
NGHVAITAVS 



ENDIQTGSAL 
ILAAGDSEIT 
LVIVSALTFI 
IKHPPKHVAD 



IiPIiSPESKAH AVIiTSOEBSG SGQGTSOSUI 



EVIVMITNLV 
SQKQRPSGRV 
GTYIVIiDSMIi 
UJSHIHAYVN 



QLAEKDGKLT 



XMRXCFQTAH 
FBTtiXBFYQE 
DYIMANYVDG 



VTQYHYTQWP 
QQIQHBGTVM 
AJjLIPGPAGK 



DMGVPEYSI.P 
IFGFTiKHIRS 
TKI>EKQFQIiL 
SYIHQYYQSH 



VQSCTVDU3I 
YHRPKAYIAA 
NFLVTQKSVQ 
VLTFVBKAAY 
QRNYLVQTEE 



EPIITQHPLI. 



VHDEHGGVTA GTFCALTTLM £ 
SbVSTRQBEN PSTSLDSHGA ALEDONUES VSSUV 

Seq ID NO: 574 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 140-4518



1560 
1620 

1680

1740 

1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 



CAGCTCCTCT 
CTTQTTGAAG 
AAATATCCAA 
CAAGTAAATG 



ATTTCCTTCG 
CCGCRQACCG 
GTGTTTGCCX3 
AGATTGGCTG 
CATGTAATAG 
TGAATCTTAA 
TTCRTAACSW: 



I 

CTCACTTOQA 
CTCCCCCTCC 
TCTGGAAATQ 
(XTGGATTGG 
GTCCTATACA 
CCCAAAACAA 



TCTATACACT GGAGGATTAA AACAAACAAA 



TGGGAAAACA 



CGAATCCTAA 
GCTAATQGAT 
GGAGCACTGA 
TCTCCTATCA 
TTTCAGGGTT 
GTGGAAATTA 
AAAGCAA6CA 



AGCGTTTCCT 
ACTACAGACA 
ATCAAAAAAA 
ATATTGATGA 
GGGATAAAAC 
ATCTCACTAA 
AGATAACITT 



GATTTCAAAQ 
TTAGATCCAT 
AATGGCCCAT 
ACAQTTAGCA 



TTCTCTAQAC 
AGTTCAOAAC 
TGGGAAAGAC 
CAGTTGGATG 
GGTGCTATTC 
TGCACTAATG 
AATCCTGAAC 



ACAGAAAAAO 
GAAGGTACTT 
AACTTGTCGG 
AGTTTATTQA 
GCAACTTCTG 
GAAAACXX»G 
GAAGATTCAA 



TCTACTGCTT 
AGTTAAQAGC 
CGATTATTGA 
TCATACTGTT 
TGACATCTCC 
TCTCTQAAAG 
TCATGCTGAT 



TGATGCGGAC 
TTTATCCATT 
TGGAGTOGAA 
GAACCTTCTG 
TCCCroCACA 
CCAGTTGGCT 



OQATTTTCAA 
TTGTTTGAGG 
AGTGTTAQTC 
CCAAACTCAA 
GACACROTTG 



GTTTTGAGGA 



CAGAAAATGT 



CTCATACACT 
TCRS6CIGAC 
TTATGATAOC 



TCAATAATTT 
GCTTATATGQ 
TTGATCTTTT 
AAGACATTGA 



ATATTTOCTT 
CAGCCTCTTT 
GGACTGCAGA 



CTATCCCATT 
AGACAATAAC 
CTTCATC3U3G 



ACCCCAQATT
GACTAACCQA 
ATCTTTAAAT 
GACTTCTCAG 
AAATGATGGC 
ATCCTTAAAT 
GCTTGATACT 
CATCTCTGAG 
ATATGATQTC 
TTCAGAAGAA 



CAAAACAATT 
GGAAAGGAAG 
CCAGAGAATT 
AT6ATTGAGA 
GAATTTTTGA 
ATGAGTTATG 
GACCAACTGA 
ATTGGAACTG 
ATTGTGAATC 
TCTRCCACMA 
TCCCCAACAA 
TOCACTTCCC 
ACTGTGACTG 
TCTAAAACTG 
ACAGTTTCTA 
GGAGCTGAAG 
AACATATCXX: 
CTTATACCAG 
TCACTAAAGG 



CTGACAAGTA 
ACTGQATTGT 
AAGTTCTTAC 
TTCGAGA6CA 
AGATTCSVTGA 
ATACCAGCCT 
AGTTTGCAGT 
CAQRTGGCTA 
TTCTTCAGAT 
TTGTCGACAT 
AAGAAATAAT 
CTGGTAGAGA 
CACACTACAA 
GAGGAAGTQA 
AACX3M3TCAC 

aactgcx:acc 

TTCTTAGATC 
TAACAGAATA 
ATTCTTCAGG 
AAGGGTATAT 



CGCTTGCATT 
ACAGAGAAAA 
TTGGGQAAAQ 
AGATCTTACA 
ATCATTGGAA 
TGACTACGGT 
TCACTGGGCA 
ATTTCCACTT 
AQCAGTCAAA 
AGAIUU^TTTG 
GCAGGCTGCT 
TTACATTTAC 
TTTTAAAOAT 
AATGCAACAA 
ACAGTACAAG 
AGCAQTTTGT 
TCTTGTTACA 
TTTGTACCAG 
TCAAGACTTG 



GCCTACTQAT 
CAAGGAGOAG 
CAGTGCTACA 
TCGCATAGGG 
ATTCTCTGGA 
TAAATTAQCC 
TCACACTGTG 
TCCACATATG 



ATCCTTCTAT GOAGGGAAAT 



1140 
1200 
1260 
1320 
1380 



1800 
1860 
1920 
1980 
2040 
2100 
GTGTGGTTrC CTAGCTCTAC AGACATAACA GCACAGCCOS ATQTTGQATC AaGCAOAOAG 2160

AGCTTTCTCX: AGACTAATTA CACTGAGATA CGTGTTGATG AATCKJAQAA GACAACCAA6 2220 

TCCTTTTCTG CAGGCCCAGT GATQTCACAG GGTCCCTCAG TTAC3M5ATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACX:CCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCC3VCGGTC AAOSTGGTAT ACTCGCAOAC AACCCAACCG 2400 

GTATACAATG CAGAGGCCAQ TAATAGTAGC CATQAOTCTC GTATTOOTCT AG CIGAG GGG 2460 

TTGGAATCXX3 AGAAGAAGGC AGTTATACCC CTTQTGATCG TCTCAGCCXTT GACTTTTATC 2520

TGTCTAGTGG TTCTTGTGGG TATTCTCATC TACTGGAGGA AATGCTTCCA GACTGCACAC 2580

TTTTACTTAG AGGACAGTAC ATCCCCTAGA GTTATATCCA C3VCCTCCAAC ACCTATCTTT 2640 

CCAATTTCAG ATGATGTCX3G AGCAATTCCA ATAAAGCACT TTCCAAAGCA TGTTGCAGAT 2700 

TTACATGCAA GTAQTGGGTT TACTGAAGAA TTTCAOACAC TGAAAGAGTT TTACCAGGAA 2760 

GTGCAGAGCr GTACTGTTGA CTTAGGTATT ACAGCAGACA GCTCCAACCA CCC3«3ACAAC 2820 

AAGCACAAGA ATCGATACAT AAATATCGTT GCCTATGATC ATAQCASG6T TAAGCTAGCA 2880 

CAGCTTGCTG AAAAGGATGG CAAACTGACT GATTATATCA ATGCCAATTA TGTTGATGGC 2940 

TACAACAGAC CAAAAGCTTA TATTGCTGCC CAAQGCCCAC TGAAATCCAC AQCTGAAGAT 3000 

TTCTGGAGAA TGATATGGGA ACATAATOTG GAAQTTATTG TCATGATAAC AAACCT06T6 3060 

GAGAAAGGAA GOAGAAAATG TGATCAGTAC TGGCCTQCCG ATGOGAOTOA GOAOTACGGG 3120 

AACTTTCTGG TCACTCAGAA GAGTGTGCAA GTGCTTGCCT ATTATACTGT GAGOAATTTT 3180 

ACTCTAAGAA ACACAAAAAT AAAAAAOGGC TCCCAGAAAG GAAGACCCAG TGGACGTGTG 3240 

GTCaCACAGT ATCACTACAC GCAGTGGCCT GACATGGGAG TACCAGAGTA CTCCCTGCCA 3300 

GTGCTQACCT TTGTGAGAAA GGCAGCCTAT GCCAAGOGCC ATGCAGTGGG GCCTGTTGTC 3360 

QTCCACTGCA GTGCTGQAGT TGGAAQAACA GGCACATATA TTCTGCTAGA CAGTATGTTG 3420 

CAGCAGATTC AACACGAAGG AACTQTCAAC ATATTTOGCT TCTTAAAACA CATCXWrTCA 3480 

CAAAGAAATT ATTTGGTACA AACTOAGGAG CAATATGTCT TCATTCATGA TA CACTG GTT 3540 

GAGGCCATAC TTAGTAAAGA AACTOAQGia CIGGACAGTC ATATTCATGC CTATGTTAAT 3600 

GCACTCCTCA TTCCTGGACC AGCAGGCAAA ACAAAGCTAfi AGAAACAATT CCAGCTCCTG 3660 

AQCX3M3TCAA ATATACAGCA GAGTGACTAT TCTGCAGCCX: TAAAGCAATG CaACAGGGAA 3720 

AAGAATCGAA CTTCTTCTAT CATCCCTGTG GAAAGATCAA GGGTTGGCAT TTCATCCXTTG 3780 

7U3TGGAGAAG GCACAGACTA CATCAATGCC TCCTATATCA TGGGCTATTA CCAGAGCAAT 3840 

OAATTCATCA TTACCCAGCA CCCTCTCCTT CATACCATCA AGGATTTCTG GAGGATGATA 3900 

TGGGACCATA ATGCCCAACT GGTGGTTATQ ATTCCTGATG GCXSAAACAT GGCAGAAQAT 3960 

aAATTTGTTT ACTGGCCAAA TAAAGATOAG CCTATAAATT GTGAGAGCTT TAAGGT CACT 4020 

CTTATGGCIG AAGAACACAA ATOTCiaTCT AATQAGGAAA AACT TATAA T TCaWOACTTT 4080 

ATCTTAGAAG CTACACAGQA TCATTATGTA CTTSAAGTGA QGCaCTTTCA GTGtCCTAAA 4140 

TGGCCAAATC CAGATAGCCC CATTACTAAA ACTTTTOAAC TTKXAA6TGT TATAAAAGAA 4200 

GAAGCTGCCA ATAGGGATGG GCCTATGATT GTTCATOATG AGCATGGAGG AGTGACGGCA 4260 

GGAACTTTCT GTGCTCTGAC AACCCTTATG CACCAACTAG AAAAAGAAAA TTCCGTGGAT 4320 

GTTTACCAGG TAGCCAAGAT GATCAATCTG ATGAGGCCAG GAGTCTTTGC TGACATTGAG 4380 

CAGTATCAGT TTCTCTACAA AGTGATCCTC AGCCTTGT6A GCaCAAGGCA GGAAGAGAAT 4440 

CCATCCACCT CTCTGGACAG TAATGGTGCA GCATTGCCTG ATGGAAATAT AGCTGAGAGC 4500 

TTAOAGTCTT TAGTTTAACA CAGAAAGGGG TGGGOGGACT CACATCTGAG CATTGTTTTC 4560 

CrCTTCCTAA AATTAGGCAQ GAAAATCAGT CTAGTTCTGT TATCTGTTQA TTTCCCATCA 4620 

CCTOACAGTA ACTTTCATOA CATAGGATTC TCCCGCCAAA TTTATATCAT TAACA ATOT G 4680 

TGOCTTTTTG CAAGACTTOT AATTTACTTA XTATGTTTOA ACTAAAATGA TtGAATTTTA 4740 

C3VGTATTTCT AAGAATGGAA TTQTGGTATT TTTTTCTGTA TTQATTTTAA CAQAAAATTT 4800 

CAATTTATAG AGQTTAGGAA TTCCRAACTA CAGAAAATGT TTGTTTTTAG TGTCAAATTT 4860 

TTAGCTCTAT TTGTAGCAAT TATCAGGTTT GCTAGAAATA TAACTTTTAA TACAGTAGCC 4920 

TGTAAATAAA ACACTCTTCC ATATGATATT CAACATTTTA CAACTGCAGT ATTCACCTAA 4980 

AGTAQAAATA ATCTGTTACT TATTGTAAAT ACTGCCCTAQ TG TCTCC ATG GACCAAATTT 5040 

ATATTTATAA TTGTAGATTT TTATATTTTA crACTOAQTC AAGTTTTCTA GTTCTGTGTA 5100

ATTOTTTAGT TTAATGACGT AGTTCATTAG CTGGTCTTAC TCTACCAGTT TTCTGACATT 5160 

aTATTGTGTT ACCTAAOTCA TTAACTTTGT TTCAGCATGT AATTTTAACT TTTGXGGAAA 5220 

ATA6AAATAC CTTCATTTTG AAAGAASTTT TTATGAGAAT AACACCTTAC CAAACATTQT 5280 

TCAAATOGTT TTTATCCMAG GAATTGCAAA AATAAATATA AATATTGCCA TTAAAAAAAA 5340 
AAAAAAAAAA AAAAAAAAAA AAAAAAA 



Seq ID NO: 575 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51 

MRIIiKRPIAC IQLLCVCaUjD WANGYYRQQR KI.VBEIGWSY TGALNQKNWG KKXPTCHSPK 60 

QSPINIDHJL TQVMVMUOa. KFQGWDKTSL EUTFIHNTQK TVElNbTMDY RVS6GVSEHV 120 

FKASKITFHW GKCNMSSDGS BHSIiKSQKFP LEMQITfCFDA DRFSSFEEAV KGKGKLIIAI.S 180 

ItPEVGTEEN LDFKAIIDGV ESVSRFGKQA AUJPPILLini IiPHSTDXYYI TOGStTSPPC 240

TDTVBWIVFK DTVSISESQL AVPCEVI.TMQ QSQYVMLMDY LQNNPREQQY KFSKQVFSSY 300 

TGKEEIHEAV CSSBPESIVQA DPEHYTSLLV TWEHPRWVD TMIEKFAVLY OQUJSEDQTK 360 

BBFlaTDGYQD LGAIUOILLP NMSYVMJIVA ICTNGLYGKY SDQLIVDMPT DNPELDLPPE 420 

UOTBEIIKB EEEQKDIEEQ AIVNPGRDSA TNQIRKKEPQ ISTTTHTORI GTKYKBAKTN 480 

BSFTBGSBFS GKOOVFilTSL HSTSQPVTKIi ATBKDISIjTS QTVTEbPFilT VEGTSASIjND 540 

GSKTVIJISPR MMLSOTAESIi NTVSITBYEE ESLLTSFKLD TQAEDSSGSS PATSAIPFIS 600 

ENISQGYIFS SENPETITYD VLIPESAMJA SmSTSSQSE BSIiKDPSMEG NVWFPSSTDI 660 

TAQPDVGSGH ESPLQTmTE IRVDESEKTT KSFSAGPVMS QGPSVTDLEM PEYSTPAYFP 720 

TEVTPKAFTP SSRQQDLVST VMWYSQTTQ PVYNAEASNS SHESRIGLAE QLESEKKAVI 780 

PLVIVSALTP ICLWIjVGIL IYWHKCFQTA HFYIiSDSTSP RVISTPPTPI FPISDDVGAI 840 

PTKHFPKHVA DLHASSGFTE EPETLKEFYQ EVQSCTVDLG ITADSSHHPD NKHKNRYIHI 900 

VAYDHSRVKt AQLAEKDQKL TDYINAHWD GYNRPKAYIA AQGPI.KSTAE DFWHMIWEHN 960 

VEVIVMITHI. VEKGBRKCDQ YWPADGSEEY GHFLVTQKSV QVLAYYTVRN PTLRMTKIKK 1020 

GSQKGRPSGE WTQYHYTQW PDMGVPEYSI. PVLTFVRKAA YAKRHAVGPV WHCSAGVGR 1080 

TGTYIVtDSM LQQIQHEGTV NIPQFIiKKIR SQRNYIiVQTE EQYVFIHDTL VBAIIiSKETE 1140 

VLDSHIHAYV HAIilPOPAG KTXLEKQFQL I.SQSNIQQSD YSAALKQCNR ERNRTSSIIP 1200 

VERSRV6ISS LSGBGTDYIN ASYIMSYYQ3 NEFIITQHPL L HTIKD FWRM IHDHHAQI>W 1260 

MIPDGQNMAE DEFWWPHKD EPINCBSFIW IbMASEBKCL SNEBKLZIQD FILEATQDDY 1320 

VIiEVRHFQCP KWPNPDSPIS KTFELISVIK SBAANRDOPM IVHDEHG6VT AGTFCALTTL 1380 

MHQIiEKENSV DVYQVAKMIN LMRP6VFADI BQYQFLYKVI LSLVSTRQEB NPSTSLDSKG 1440 
AALPDGNIAE 8I1ESI.V 
Seq ID NO: 576 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 148-4494

1 11 21 31 41 51

I I I I I I 

CACRCATACa CACGCACGAT CTCACTTCGA TCTATACACT GC5AGGATTAA AACARACAAA 60 

a«AAAAAAC ATTTCCTTOG CTOCCCCTCC CTCTCCACTC TGAGAAGCAG AC3GAGCCGCA 120 

CGGCGAGGGO CCGCAGAC06 TCTGGAAATG OGAATCCTAA AGCGrrTCCT CGCTTGCATT 180 

C»GCTCCTCT GTOTTTGCOa CCTOOATTGG GCTAATGQAT ACTACAGACA ACAGAGAAAA 240 

CTTOTTGAAG AGATTCGCTG GTCCTATACR GGAGCACTGA ATOUiAAAAA TTGGGGAAAG 300 

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTr GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCA CTAA TGACTAOCGT 480 

QTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540 

AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCXaCTT 600 

GAGATCCAAA TCTACTGCTT TGATGCAGAC CQATTTTCAft GTTTTGAGGA AGCAGTCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720 

QATTTCAAAO CGATTATTGA TGGAQTCGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780 

TTAiQATCCAT TCRTACTGTT OAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATCGCTCAT TGACATCTCC TCCCTGCACA QACACAGTTG ACTGGATTGT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT tfrrm ' TG TG AAOTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCaiTQCTGAT GGACTACTTA CAAAACAATT TTCaAGAGCA ACAGXACTlAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATOR A OCAGITTC T 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAAGAC CTOOAGTCXjT TTATGATACC ATGATTGAGA AGTTTGCAST TTTCTACCAG 1200 

CAGTTO6ATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

QGTGCTATTC TCAATAATTT OCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATO QCTTATATOG AAAATACAGC GACCAACTGA TTGTCX3ACAT GCCTACTGAT 1380 

AATCCT6AAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

OAAGAGGGAA AAGACATTGA AGAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACaA CACACTACRA TOJCaVTAGGG 1560

ACGAAATACA ATGAAGCCAA GACTAACCX3A TCCCCAACAA OAGGAASTGA ATTCTCTGQA 1620

AAGGQTGATQ TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACC3KGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATtTCCTT GACTTCTCAG ACTGTGACTQ AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTCA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATrTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAOAAGAA TCACTAAAGG ATCCTT CTAT GGAGGGAAAT 2100 

OTGTGGTTTC CTAQCTCTAC AGACATAACA GCACAGCCOG ATGTTOGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGROATA CSTGrTOATG AATCTGAGRA CftCAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT OATGTCACAG GGTCCCTCAQ TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTrTGCCTA CTTCCCRACT GAGGTARCAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC RACCCAACCG 2400 

GTATACAATG AGGCCRGTAA TAGTAGCCXT GAGTCTCGTA TTGGTCTAGC TGAGGGGTTG 2460 

GAATCOGAGA AGAAGOCAGT TATACCCCTT GTGATCGTGT CAGCCCTGAC TTTTATCTGT 2520

CTAQTGGTTC TTGTGGGTAT TCTCATCTAC TOGAGGAAAT GCTTOCAGAC TGCACACTTT 2580 

TACTTAGAGG ACROTACATC CXXrCAOAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 2640 

ATTTCAORTG ATGTCQSAGC RATTCCRRTR AAQCACITTC CAAAlGCATGT TGCAGATTTA 2700 

CRTGCRAGTA GTGGGTTTAC TGAAGARTTT OAGGRRSIGC AORGCTOTAC TGTTGACTTA 2760 

GGTATTACAG CAGACRGCTC CAACCACCCR 6ACRACAAGC ACAAGAATOG ATACATAAAT 2820

ATCGTTGCCT ATGATCATAG CAGGGTTAAG CTAGCACAGC TTGCTGAAAA GGATGGCAAA 2880 

CTGACTGATT ATATCAATGC CAATTATGTT GATGGCTACA ACAGACC3UVA AGCTTATATT 2940 

GCTGCCCAAG GCCCACTGAA ATCCACAGCT GAAGATTTCT GGAGAATGAT ATGGGAACAT 3000 

AATGTGGAAG TTATTGTCAT GATAACAAAC CTCGTGGAGA AAGGAAGGAG AAAATGTGAT 3060 

CRGTRCTGGC CTGCCGRTGG GRGTGRGGAG TACGGGAACT TTCTGGTCAC TCAGAAGAGT 3120 

GTGCAAGTGC •TTQCCTATTA TACTOTQAGG ARTTTTACTC TAAGAAACAC AAAAATAAAA 3180 

AAGGGCTCCC AeAAAGGAAfl ACCCAGTCGA OGTGTGGTCA C»C3«3TATaV CTACAasCAG 3240 

TGGCC IG RCA TOGGAaiACC AGAGTACrCC CIGCCAGTGC TGACCTTTGT GAGAAAGGCA 3300 

GCCTATGCCA AGCGCCATGC AGTOGGGOCT GTTQTOJTCC ACTGCAGTGC TGGAGTTGQA 3360 

AGAAC3VGGCA CATATATTGT GCTAGACAGT ATGTTGCRGC AGATTCAACA CaAAGGAACT 3420 

GTCAACATAT TTGGCTTCTT AAAACACATC CGTTCACAAA GAAATTATTT GGTACAAACT 3480 

GAGGAGCAAT ATCTCTTCAT TCATGATACA CTGGTTQAGG CCATACTTAQ TAAAQAAACT 3540

GAGGTCCTGG ACAGTCaO'AT TCATGCCTAT GTTAATOCAC TCXTTCATTCC TGQACCAGCA 3600 

GGCAAAACAA AGCTAGAGAA ACAATTCCAG CTCCTGAGCC AGTCAAATAT ACAGCAGAGT 3660 

GACXATTCTO CAGCCCTAAA GCAATGCAAC AGGGAAAAGA ATCGAACTTC TTCTATCATC 3720 

GCZGrSGAAA GATCAAGGGT TOGCATTTCA TCCCTGAGTG GAGAAGGCAC AGACTACATC 3780 

AATOOCrCCT ATATCATGGG CTATTACCAG AGCAATGART TCATCATTAC CXAQCACCCT 3840 

CTCCrrcRTA CCATCAAGGA TTTCTGGAGG ATGATATGGG ACCATAATGC CCAACTGOTG 3900 

QTTATQATTC CTGATGGCCA AAACATGGCA 6AAGATGAAT TTGTTTACTG OCCRAATAAA 3960 

GATGAGCCTA TAAATTGTGA GAGCTTTAAG' GTC3VCTCTTA TGGCTGAAGA ACRCAAATQT 4020 

CTATCTAATG AGGAAAAACT TATAATTCAG GACTTTATCT TAGARGCTAC ACSffiGaTGRT 4080 

TATGTACTTG AAGTGAGGCA CTTTCAGTGT CCTAAATGGC CAAATCCAQA TAGCCCCATT 4140 

AGTAAAACTT TT6AACTTAT AAGTGTTATA AAAGAAGRAG CTGCCAATAQ GGATGOGCCT 4200 

ATGATTOITC ATGATGAGCA TCGAGGAGTG AGGGCAGGAA CTTTCTOTOC TCT6RCAACC 4260 

CTTATOCACX: AACTAGAAAA AQAAAATTCC GTGGATGTTT ACCAGGTAGC CAASATGATC 4320 

AATCIGRTGR GGCCS^GGAGT CTTTGCTGAC ATTGAGCA6T ATCAGTTTCT CTACAAAGTG 4380 

ATCCTC»GCC TTGTGAGCAC AACGCAGGAA GAGRATCCAT CCACCTCTCI GORCAGTAAT 4440 

GGTGCRGCRT TGCCTGATGG AAATATAGCT GAGRGCTTAG AGTCTTtAGT TTAACRCAGA 4500 

AAGGGGTGGG GGGACTCACA TCTGAGCKTT QTTTTCCTCT TCCTAAAATT AGGCRGGAAA 4560

ATCRGTCTAG TTCTGTTATC TGTTGATTTC CCATCAOCIG ACASTAA CTT TCMGACATA 4620 

GGATTCTGCC QCCAAATTTA TATCATTAAC AATGIGI6CC TTTTTGCRRG ACTTGTAMT 4680 

TACTTATTAT GTrTGAACTA AAATGATTGA ATTTTACRGT ATTTCTARGA ATGGRATTGT 4740 

GOTATTTTTT TCTGTATTGA TTTTAACRGA RRATTTCRRT TTATAGRGGT TAGGRATTCC 4800 
AAACTACAGA AAATGTTTGT TTTTAGTOTC AARTTTTTRG CTaTATTTGT AGCAATTATC 4860

AOSTTTGCTA GAAATATAAC TTTTAATACA GTAGCCTGTA AATAAAACAC TCTTCCATAT 4920 

GATATTCAAC ATTTTACAAC TGCAGTATTC ACCTAAAGTA GAAATAATCT GTTACTTATT 4980 

GTAAATACTG CCCTAGTGTC TCCATGGACC AAATTTATAT TTATAATTGT AGATTTTTAT 5040

ATTTTACTAC TQAGTCAAGT TTTCTAGTTC TGTGTAATTG TTTAGTTTAA TQACGTAGTT 5100

CATTAGCTGG TCTTACTCTA OCAOTTTTCT GACATTOTAT TOTGrTACCT AAGTCATTAA 5160 

CTTTGTrrCA GCATGTAATT TTAACTTTTG TGGAAAATAG AAATACCTTC ATTTTQAAAG 5220 

AAGTTTTTAT 6AGAAXAACA CCTTACCAAA CATTOTTCftA ATGGTTTTTA TCCAAGGAAT 5280 

TQCSUIAAATA AATATAAATA TTGCCATTAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 5340 
AAA 

Seq ID NO: 577 Protein sequence
Protein Accession #: EOS sequence 

1 11 21 31 41 51 

I I I I I I 

MRILKRFLAC IQLLCVCaiLD HAMGVmQQR KLVEEIGHSY TOAIiKQXNHG KKyPTOfSPK 60 

QSPINIDEOL TQVNVNLKKI. KFQGHDKTSL ENTFIBNTOK TVBINLTNDY RVS6GVSBMV 120 

FXASKITFHW GKCNMSSDGS EHSIiEGQKFP LEMQIXCFDA DRFSSFB£»V KGKGKLRAIiS 180 

ILFEVQTEEH LDFKAIIDGV ESVSRFOKQA ALDPFILLNL t.PNSTDKYYI YNGSLTSPPC 240 

TDTVDWIVFK DTVSISESQL AVFGEVLTMQ QSGYVMUUDY LQNNFREQQY KFSRQVFSSY 300 

TGKEBIHEAV CSSEPENVQA DPENYTSLLV TWEHPRWYD TMIEKPAVDY QQLDGEDQTK 360 

BEFI.TDGYQO LGAILNNLLP NMSYVLQIVA ICTMGLYGKY SDQLIVDMPT DMPELDLFPE 420 

LIGTEEIIKE EEE6K0IEEG AIVMPGROSA TKQIRKKBPQ ISTTTHYHRI 6TKXNEAKTN 480 

RSPTRGSBFS GXGDVPNTSI. NSTSQPVTia. ATBXDISLTS QTVTBLPPHT VGGTSASLHD 540 

GSKTVUtSPH Htn.SGTAESIi HrVSITBYSE BSUbTSFKLD TGAEOSSGSS PATSAIFFIS 600 

ENISQGYIFS SEHFGTITYD VLIPESARHA SEDSTSSGSB BSUmPSMEG HVWFPSSTDI 660 

TAQPDVGSGR ESFLQTNYTE IRVDESBKTT KSFSAGPVMS QGPSVTDLEM' PHYSTPAYPP 720 

TEVTPHAFTP SSRQQDLVST VNWYSQTTQ PVYNEASMSS HESRIGLAEG LESKKKAVrP 780 

LVIVSALTFI CLVVLVGILI YWRKCFQTAH FYLBDSTSPR VISTPPTPIF PISDDVGAIP 840 

IKHFPKHVAD LKASSGFTEE FBBVQSCTVD LQITADSSKH PranOiKNRYI NIVAYDHSSV 900 

KLAQLAEKDG KLTDYIMANY VDGYNRPKAY lAAQGPLKST AEDFWRMIHE BNVEVIVMIT 960 

NltVEKGRRKC DQYHPADQSB EYGMFLVTQK SVQVLAYYTV SNFTLRNTKI KK6SQK6RPS 1020 

GRWrOYHYT QHPDKQVP&y SLPVI.TFVRK AAYAXRHAVG PVWKCSAGV GRTCmiVU] 1080 

SHLQQIQHEG TWIFGFLXB ISSaBiahVQ TEEQYVFIBD TLVBAIIiSKB TEVLDSHIBA 1140 

YVNAUjIPQP AGKTXLEXQF QU<aQSNIQQ SDYSAAUCQC NRBKNRTSSI IPVGRSRVGI 1200 

SSLSGEGTDY IMASYIMSYY QSHEFIITQH PUiKTIKOFH RMIWOBNAQL WMIPDGONM 1260 

ABDEFVYHPH KDBFINCESF XVTUnEEKK CLSMEBKLII QDFIIjEATQD OYVLEVRHFQ 1320 

CFKHPHPDSP ISKTFELISV IKEEAAHBDQ PMIVHOEKOG VTAGTFCAIiT TLMHQLEKEN 1380 

SVDVYQVAKM mLMRPGVFA DIEQyQFLYK VILSLVSTRQ SENFSTSLOS HGAALPDONI 1440 
AESI.ESLV 



Seq ID NO: 578 DNA sequence

Nucleic Acid Accession #: EOS sequence

Coding sequence: 501-4514

1 11 21 31 41 51 

I I I I I I 

CACACATACG CACGCACGAT CTCACTTCGR TCTATACACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCtaCTC TGAGAAGCAQ AGGAGCCBCA 120 

CXK3CQAGGGQ CXX3CAGACXIG TCTGGAAATG CQAATCCTAA AOCGTTTCCT CGCTTGCATT 180 

CaSCKXTCT GTOTTTGCCG CCTOGATTCG QCTAATGGAT ACTACAQACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACT6A ATCAAAAAAT TGGGQAAAQA 300 

AATATCCAAC ATGTAATAGC CCAAAACAAT CTCCTATCAA TATTGATGAA GATCTTACAC 360 

AAGTAAATGT GAATCTTAAG AAACTTAAAT TTCAGGGTTG GGATAAAACA TCATTGGAAA 420 

ACACATTCAT TCATAACACT GGGAAAACAG TGGAAATTAA TCTCACTAAT GACTACCGTG 480 

TCAGCGGAGG AGTTTCAGAA ATGGTOTTTA AAGCAAGCAA GATAACTTTT CACTGGGGAA 540 

AATGCAATAT GTCATCTGAT GGATCAGAGC ATAGTTTAGA AGGACAAAAA TTTCCACTTG 600 

AGATGCAAAT CTACTGCTTT GATGCGGACC GATTTTCftAG TTTTGAGGAA GCAGTCAAAG 660 

GAAAAGGGAA OTTAAGAGCT TTATCCATTT TGTTTGAGGT TGGGACAGAA GAAAATTTGG 720 

ATTTCAAAGC GATTATTGAT GGAGTCGAAA GTGTTAGTCG TTTTOaOAAG CAGGCTGCTT 780 

TAQATCCATT CATACTGTTG AACCTTCTGC CAAACTCAAC TGACAAGTAT TACATTTACA 840 

ATGGCTCATT GACATCTCCT CCCTGCACAG ACACAOTTOA CTGGATTGTT TTTAAAGATA 900 

CftGTTAQCRT CTCTGAAAGC CAGnOSCTQ TTTTTTOT8A AGTTCTTACA ATGCAACAAT 960 

CTGGTTATGT CATGCTQATG GACTACTTAC AAAACAATTT TCGAGAGCAA CAGTACAAQT 1020 

TCTCTAGACa GGTGTrTTCC TCATACACT6 GAAAGGAAQA GATTCATGAA GCAGTTTGTA 1080 

GTTCAGAACC AGAAAAT6TT CAGGCTGACC CAGAGAATTA TACCAGCXTTT CTTGTTACAT 1140 

QGOAAAQACC TCQAGTCXJTT TATGATACCA TOATTaAGAA GTTTGtaGTT TTGTACCAGC 1200 

AGTTGGATGG AGAGQACCaA ACCAAGCATG AArrTTTQAC AGATGGCTAT CAAGACTTGG 1260 

GTGCTATTCT CAATAATTTG CTACCCAATA TQAGTTATGT TCTTCAGATA GTAGCCATAT 1320 

GCACTAATGG CTTATATGGA AAATACAGCG ACX»ACTGAT TGTCGACATQ CCTACTGATA 1380 

ATCCIOAACT TGATCTTTTC CCTGAATTAA TTGGAACTGA AOAAATAATC AAGGAGGAGG 1440 

AAQAGGGAAA AGACATTGAA GAAGOOGCTA TroTGAATCC TQOXAGAGAC AGTGCTACAA 1500 

ACCAAATCAG GAAAAAGGAA CCCCAGATTT CTAOCACAAC ACACTACAAT OGCATAGGOA 1560 

CGAAATACAA TGAAGCX»AG ACTAACOGAT CCCCAACAAG AGGAAGTGAA TTCTCTGQAA 1620 

AGGGOGATGT TCCCAATACA TCTTTAAATT CCACTTCCCA ACCAGTCACT AAATTAGCCA 1680 

CAOAAAAAGA TATTTCCTTC ACTTCTCAaA CTGTGACTGA ACTGOCACCT CACaCTGTGG 1740 

AAGGTACTTC AGCCTCTTTA AATGATGGCT CTAAAACIGT TCTTAOATCT CC3VCATATGA 1800 

ACTTGTOGGG QACTQCAGAA TCCITAAATA CAGTTTCTAT AACAGAATAT GAGGAGGAQA 1860 

GTTTATTGAC CAGTTTCAAG CTTGAXACTG GAGCTGAAGA TTCTTCAGGC TTCCAGTCCCG 1920 

CMCTTCTGC TATCCCATTC ATCTCTGAGA ACATATCCXJl AGGGTATATA TTTTCCTCCG 1980 

AAAACGCASA GACAATAACA TATQATOTCX: TTATACCAQA ATCTGCTAOA AATGCTTCCG 2040 

AAQKTTCAAC TTCATCAGGT TCASAAGAAT CACTAAASGA TCCTTCIATQ GAGGGAAATG 2100 

TGIGSITTCC TAGCTCTACA GACATAACAO CACAGCCOSA TQTTGGATCA GGOVGAGASA 2160 

GCTTTCTCCA GACIAATTAC ACTGAGATAC GTIJTTGATaA ATCTGAGAAG ACAACCAAGT 2220 

CCTTTTCTGC ASGCCCAGTG ATSTCACAGG GTCCXTTCAGT TACAGATCTG GAAATGCCAC 2280 
ATTATTCTAC CTTTGCCTAC TTCXX3VACTG AGGTAAC&CC TCATGCTTTT ACCCCATCCT 2340

CCAGACAACA. GQATTTQGTC TCCACX3GTCIV AaTTOGTATA CTGGCAGACA ACCCAACCGG 2400 

TATACAATQA GGCCAOTAAT AGTAGCCATC AGTCTOGTAT TGGTCTAGCT GAGGGGTTGG 2460 

AATCCX»QAA CBUVGGCAGTT ATACXX:CTTG TGATCOTGTC AGCCCTQACT TTTATCTGTC 2520 

TAGTGGTTCT TGTGGGTATT CTCATCTACT GGAGGAAATG CTTCCAGACT GC ACAC TTTT 2580 

ACTTAGAGOA CAGTACATCC CCTAGAGTTA TATCCACACC TCCAACACCT ATCTTTCCRA 2640 

TTTCAGATGA TGTCGGAGCA ATTCC3VATAA AGCRCTTTCC AAAGCATGTT GCAGATTTAC 2700 

ATGCAAGTAO TGGGTTTACT GRAGAATTTG AGACACTGAA AGAGTTTTAC CAGGAAGIGC 2760

AGAGCTOTAC TOTTGACTTA GGTATTACAG CAGACAGCTC CAACXACCX3V GACAAC3UVGC 2820 

ACAAOAATCO ATACATAAAT ATCGTTGCCT ATGATCATAG CAGGGTTAAG CTAGCACAGC 2880 

TTGCTGAAAA GGATGaCSVAA CTGACTGATT ATATCAATGC CAATTATGTT GATGGCTACA 2940 

ACAGACCAAA AGCTTATATT GCTGCCCAAQ GCCCACTGAA ATCCACAGCT GAAGATTTCT 3000 

GGAGAATGAT ATGGGAACAT AATGTGGAAG TTATTGTCAT GATAACAAAC CTCGTGGAGA 3060 

AAGOAAGGAQ AAAAT6TGAT CAGTACTGGC CTGCCGATGG GAGTGAGGAG TACGGGAACT 3120 

TTCTGGTCAC TCAGAAGAGT GTGCAAGTGC TTGCCTATTA TACTGTGAGG AATTTTACTC 3180 

TAAGAAACAC AAAAATAAAA AAGGGCTCCX: AGAAAGGAAG ACCCAGTGGA C3QTGTGGTCR 3240 

CACAGTATCav CTACACGCAG TGGCCTGACA TGGGAGTACC AGAGTACTCC CTQCCABTGC 3300 

TGACCTTTGT GAlGAAAGGCA GCCTATGCCA AGCGCCATGC AGTGGGGCCT G TTGTC GTCC 3360 

ACTGCAOIGC TGGAGTTGGA AGAACAGGCA CATATATTGT GCTAGACAGT ATGTTGCAGC 3420 

JU3ATTC3UCA CGAAGGAACT GTCAACATAT TTGGCTTCTT AAAACACATC CXSTTCAO^AA 3480 

GAAATTATTT GGTACAAACT GAGGAGCAAT ATGTCTTCAT TCATGATACA CTGGTTOAGG 3540 

CCATACTTAS TAAAOAAACT GAGGTGCTGG ACAGTCATAT TCATGCCTAT GTTAATGCAC 3600 

TCCTCATTCC TGGACCAGCA GGCAAAACAA AGCTAGAGAA ACAATTCCAG CTCCTGAGCC 3660 

AGTCSAATAT ACAGCAGAGT GACTATTCTG CAGCCCIAAA GCAATGCAAC AGGGAAAAGA 3720 

ATOGAACTTC TTOTATCATC CCTGTGGAAA GATCAAGGGT TGGCATTTCA TCCCTOAGIG 3780 

GAGAAGGCAC AGACTACATC AATGCCTCCT ATATCATGGQ CTATTAOCAO AGCAATGAAT 3840 

-TCATCATTAC CCAGCACCCT CTCCTTCATA CCATCAAGGA TTTCTGGAGG ATGATATGGG 3900 

ACCATAATGC CCAACTGGTG GTTATGATrC CTGRTGGCCA AAAC»TGGCA GAAGATGAAT 3960 

TTGTTTACro GCCAAATAAA GATGAGCCTA TAAATTGTGA GAGCTTTAAG GTCACTCTTA 4020 

TOGCTGAAGA ACACAAATGT CTATCTAATG AGGAAAAACT TATAATTCAG GACTTTATCT 4080 

TAQAAOCTAC ACAGGATGAT TATGTACTTO AAGTGAGGCA CTTTCaWJTGT CCTAAATGGC 4140 

CAAATCCAGA TAGCCCCATT AGTAAAACTT TTGAACTTAT AAOTGTTATA AAftGAAGAAG 4200 

CTGCCAATAQ GQATGGGCCT ATGATTQTTC ATGATGAGC» TGGAGQAOTQ ACGGCAGOAA 4260 

CTTTCTGTGC TCTGACAACC CTTATGCRCC AACTAGAAAA AGAAARTTOC GTOQATOTTT 4320 

ACCAGOrAGC CAAGATGATC AATCTGATGA GGCOUWAGT CTTTGCTGRC ATTQAGCftGT 4380

ATCAGTTTCr CTACAAAGTG ATCCTCA6CC TTGTGAGCAC AAGGCAGGAA GAGAATCCRT 4440 

CCACCTCTCT GGACSIGTAAT GGTGCAGCAT TGCCTGATGG AAATATAGCT GAGAGC TTAG 4500 

AQTCTTTAGT TTAACACAGA AAGGGGTGGQ GGGACTCACA TCTGAGCATT GTTTTCCTCT 4560 

TCCTAAAATT AGGCAGGAAA ATCACrTCTAQ TTCTGTTATC TGTTGATTTC CCATCACCTG 4620 

ACAGTAACrr TCATQACATA GGATTCTGCC GCCAAATTTA TATCATTAAC A ATGTG TGCC 4680 

TTTTTGCAAG ACTTGTAATT TACTTATTAT GTrTGAACTA AAATGATTGA ATTTTACAGT 4740 

ATTTCTAAGR ATGGAATTGT GGTATTTTTT TCTGTATTGA TTTTAACAGR AAAT TTCAA T 4800 

TTATAOAGQT TAGGAATTCC AAACTACAGA AAATGTTTGT TTTTAGTGTC AAATTTTTAO 4860 

CTQTATnGT AOCRATTATC AOGrTTQCTA GAAATATAAC TTTTAATACA GTAGCCTGTA 4920 

AATAAAACAC TCTTCCATAT GATATTCAAC ATTTTACAAC TGCAGTATTC ACCTAAAOTA 4980 

GAAATAATCT GTTACTTATT GTAAATACTG CCCTAGTQTC TCCATGGACC AAATTTATAT 5040 

TTATAATTGT AGATTTTTAT ATTTTACTAC TQAGTCAAGT TTTCTAGTTC TCTOTAAXTG 5100 

TTTAGTTTAA TGACGTAQTT CATTAGCTGG TCTTACTCTA CCAGTTTTCT GACATTGTAT 5160 

TGTGTTACXrr AAGTCATTAA CTTTGTTTCA GCATOTAATT TTAACTTTTG TGGAAAATAG 5220 

AAATACCTTC ATTTTGAAAG AAGTmTAT GAGAATAACA CCTTACCAAA CATTGTTCAA 5280 

ATGGTTTTTA TCC3«CGAAT TCCAAAAATA AATATAAATA -rrGCCATTAA AAAAAAAAAA 5340 
AAAAAAAAAA AAAAAAAAAA AAA 

Seq ID NO: 579 Protein sequence
Protein Accession #: EOS sequence 

1 11 21 31 41 51 

MVFXASKITF HWGKCNMSSD GSEHSI£GQK FPLEMQIYCF DADRFSSFEE AVKGKGKLRA 60 

LSILFEVOTE EHIiDFXAIID GVESVSRFGK OAAlDPFIIi NbLPUSTDRX TflYNGSLTSP 120 

PCIDTVDWIV FXDTVSZSES QIAVFCEVLT HQQSGXVMIW DYLQNNFREQ QYKFSRQVFS 180 

SnOKEBIHS AVCSSEPBIV QADPENyTSI. LVTWBRPRW YDTMIEKFAV LYQQLDGEDQ 240 

TKHBFIiTDa* QDLGAIianH. LPNMSYVLQI VAICTNGI.YG KYS DQIiIVD M PTDNPEIJ3LP 300 

PBLIGTEBII KEEEBGKDIE BGAIVNPGSD SAINQIRKKE PQISTTTHYM RIGTKniEAK 360 

TORSPTRGSB FSGKGDVPMT SIMSTSQFVT lOATBKDISI. TSOTVTSLPP HTVEGTSASL 420 

HUeSKTVIiRS PHMNLSGTAE SLNrVSITBY BEESUiTSFK LDTOAEOSSG SSPATSAIPF 480 

ISENISQGYI FSSENPETIT YDVLIPESAR KASEDSTSSG SEESUOJPSM EQKVWFPSST 540 

DITAQPDVGS GRESPLQTNY TEIRVDESEK TIKSFSAGPV MSQGPSVTDL EMPHYSTPAY 600 

FFTEVTFHAF TPSSHQODLV STVHWYSQT TQPVYNKASN SSHESRIGLA EGLBSBKKAV 660 

IPLVIVSAIiT FIOiWLVQI IiIYHRKCFQT AHFYLEDSTS PRVISTPPTP IFPISDDVGA 720 

IPIKHFPKBV ADU{ASSOFT EEPBTIiKBFY QEVQSCTVDI. GITADSSNHP DNKHKNRYIN 780 

rVAYOHSRVK LAQIAEKDGK LTDYIMANYV DGYNRPKAYI AAQGPLKSTA EDFWHMIHEH 840 

NVEVrVNITH bVEKGRSKCD QYWPAOGSEE YGKFLVTQKS VQVIAYYTVR N^ITLRNTKIK 900 

KGSQKGRPSG RWTOYHYTO WPDMGVPBYS LPVLTFVHKA AYAKHHAVGP VWHCSAGVG 960 

RTGTYIVLDS MliQQIQHEGT VNIFGFUail RSQRNYLVQT .EEQYVFIHDT I.VEAII.SKBT 1020 

EVLDSHIHAY VNALLIPGPA GKTKLEKQFQ LLSQSNIQQS OYSAAUtQCH SBKMRTSSII 1080 

PVER9RVGIS SLSGEGTDYI MASYIMGYYQ SHEFIITQHP LLBTIKDFHR MIWDHKAQLV 1140 

VMIPDOeiMA EDEFVYWPNIf DEPINCESFK VTLMABEHXC I.SIIEBKLIIQ DFII.EATQDD 1200 

yVLEVHHFQC PKWPWPDSPI SETFELISVI KEEAANR06P HZVBDEKGGV TflGTPC3U.TT 1260 

LMaOLBKEHS VDVYQVAKMI NLMRPGVFAD IBQYQPLYKV ILSLVSTHOB EHPSTSLDSN 1320 
GAAIiPDCailA ESLESLV 

Seq ID NO: 580 DNA sequence

Nucleic Acid Accession #: EOS sequence

Coding sequence: 148-4632 

1 11 21 31 41 51 
I I I I I I

CACACATACX3 C3VCGCACGAT CTCaVCTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60

CAAAAAAAAC ATTTCCTTOG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120 

CGGCGAGGGG CCGCAGACOS TCTGGAAATO CGAATCCTAA AACGTTTCCT CGCTTGC3VTT 180 

CAGCTCCTCr QTGTTTGCXB CCK3GATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTTOTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTQA ATCAAAAAAA TTGGGGAAAG 300 

AAATATOCAA C31TGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATQA AOATCTTACA 360 

CAAQTAAATG TGAATCTTAA GAAACTTAAA TTTCAGQGrT GGGATAAAAC ATCATTGOAA 420 

AACACATTCA TTCaTAACAC TGGQAAAACA OTGOAAATTA ATCTCACTAA TGACTACOGT 480 

GTCAGCGGAG GAGTTTCAGA AATGaTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGQA 540 

AAATGCAATA TGTCATCTGA TGCSATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGAXGCAAA TCTACTGCTT TGATGCGOAC CQATTTTCAA QTTTTGAGGA AGCAGTCAAA 660

6GAAAAGGGA AGTTAAGAGC TTXATCCATT TTCTTTGAGG TTGOGACAGA AGAAAATTT6 720 

GATTTCAAAO OQATTATTGA TGGAOTOGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780 

TTAGATCCAT TCATACTQTT GAACCTTCTG CCAAACTCAA CTOACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACIGGATTGT TTTTAAAQAT 900 

ACAGTIAOCA TCTCTOAAAG CCAaTTGGCT G TTV rf T G TB AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATO TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCTAOAC AGGTGTTTTC CTCATACACT GGAAAGGAAO AGATTCATGA AGCAGTTTOT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAAGAC CTCGAGTCXST TTATGATACC ATGATTGAGA AGTTTGCaGT TTTGTACCAG 1200 

CAGTTOGATG GAGAQGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

QQTGCTATTC TCAATAATTT GCTACCCAAT ATGAQTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GA<XAACTGA TTGTOGACAT GCCTACTGAT 1380 

AATCCTGAAC TTGATCTTTT CCCTQAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAO 1440 

QAA6AGGGAA AAOACATTOA AGAAGGCGCT ATTGTSAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGQA ACCCCAGATI 7CTACX»CAA CACRCTACAA TCGCATAGGG 1560 

ACtaAATACA ATGAAGCCAA GACTAACCOA TCCCCAACAA GAGGAAOTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAQAAAAAO ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTQCCACC TCACACTGTG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTCX3G GGACK3CAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTQA CCAOTTTCAA GCTTQATACT QGAGCTGAAQ ATTCTTCAGG CTCCAGTCCC 1920 

OCAACrrCTO CTATCCCATT CATCTCTGAG AACATATCCC AAGOQTATAT ATTTTCCTCC 1980 

GAAAACCCAC AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCX: 2040 

QAAOATTCaVA CTTCATCAGQ TTcaUSAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100

GTGTaGTTTC CTAGCTCTAC AGACATAACA GCACAGCXTOS ATQTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA OGTGTTGATQ AATCTQAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT QATGTCACAG GGTCCCTCAG TTACAG ATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCX:CAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTrGGT CTCCAOGGTC AACGTGGTAT ACTCGCAGAC AACCCakACCG 2400 

GTATACAATQ AGGCCAGTAA TAGTAGCCAT GAGTCTOSTA TTCGTCTAGC TGAGGGGTTG 2460 

GAATCOQAOA AQAAGaCAGT TATACCCCTT GTGATCGTGT CAGCCCTGAC TTTTAT CTGT 2520 

CTAGTOOrTC TTQTGGQTAT TCTCATCTAC TGGAGGAAAT GCTTCCAGAC TGCACACTTT 2580

TACTTAGAG6 ACACTACATC CCCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 2640 

ATTTCAGATQ ATGTCGGAGC AATTCCAATA AAGCACTTTC CAAAGCATGT TGCAGATTTA 2700 

CATGCAAGTA GTGGaTTTAC TGAASAATTT GAGACACTGA AAGAGTTTTA CCAGGAAGTG 2760

CAGAGCTGTA CTOTTGACTT AGGTATTACA GCAGACAGCT CCAACCACCC AOACAACAAO 2820 

CACAAGAATC GATAC3VTAAA TATCGTTGCC TATGATCATA GCAGGGTTAA GCIAGCACAG 2880 

CTTGCTGAAA AGGATGGCAA ACTGACTQAT TATATCAATG CCAATTATGT TGATGGCTAC 2940 

AACAGACCAA AAGCTTATAT TGCTGCCCAA GGCCCACTGA AATCCACAGC TQAJM3ATTTC 3000 

TaOAGAATOA TATGGOAACA TAATGTGGAA GTTATTGTCA TGATAACAAA CCTCX5TGGAG 3060 

AAAGGAAGGA OAAAATOTGA TCAGTACTGG CCTGCCGATG GGAaTQAGGA QTACX3GGAAC 3120 

TTTCTCGTCA CTCAGAAGAG TGTGCRAGTG CTTGCCTArX ATACTGTGAG G AATTT TACT 3180 

CTAAGAAACA CAAAAATAAA AAAGGGCTCC CAGAAAGGAA GACCCAGT6G ACGTGTGGTC 3240 

ACACAGTATC ACTACACGCA GTGGCCTGAC ATGGGAGTAC CAGAGIACTC CCTGCC AOTS 3300 

CTGACCTTTG TCAGAAAGGC AGCCTATGCC AAGCGCCATG CSMSTGaOGCC TGTTOTOBTC 3360 

CACTGCAGTG CTGGAQTTGG AAGAACAGGC ACATATATTG TGCTAOACAO TATGTTGCAO 3420 

CAGATTCAAC ACGAAGGAAC TGTCAACATA TTTGGCTTCT TAAAACACAT CCGTTCACAA 3480 

AGAAATTATT TGGTACAAAC TGAGGAGCAA TATGTCTTCA TTCATGATAC ACTGGTTGAG 3540 

GCtaVTACTTA GTAAAGAAAC TGAGGTGCTG GACAGTCATA TTCATGCCTA TGTTAATGCA 3600 

CTCCTCATTC CTGGACCAGC AGGCAAAACA AAGCTAGAGA AACAATTCCA GGGTCTCACT 3660 

CTGTCACCCA QGCTGGAGTG CAGAGGCACA ATCTCGGCTC ACTGCAACCT TCCTCTCCCT 3720 

GGCTTAACTG ATCCTCCTAC CTCAGCCTCC CXJAGTGGCTG GQACTATACT CCTGAGCCAG 3780 

TCAAATATAC AGCAOAO-TGA CTATTCTGCA GCCCTAAAiGC AATGCAACAG GOAAAAQAAT 3840 

CGAACTTCTT CZATCATOCC TGTGGAAAGA TC3UK3GQTTG QCATTTCATC CCIGAOTGCA 3900 

GAAGGCACAG ACTACATCAA TQCCTCCTAT ATCATGOaCT ATTACCAGAQ CAATOAATTC 3960 

ATCATTACXX: AGCaOCCTCT CCTTCATAOC ATCBAGGATT TCTGQASGAT GATATGOGAC 4020 

CATAATGCCC AACTGQTGGT TATGATTCCT GATGGCCAAA ACATGGC3«3A AGATGAATTT 4080 

GTTTACTGGC C7UVATAAAGA TGAGCXTTATA AATTGTGAGA GCTTTAAGGT CACTCTTATG 4140 

QCTGAAGAAC ACAAATGTCT ATCTAAl-QAG GAAAAACTTA TAATTCAGGA CTTTATCTTA 4200

GAAGCTACAC AGQATQATTA TGTACTTGAA GTOAaOCaCT TTCAGTGTCC TAAATGGCX» 4260 

AATCXaGATA GCCCCATTAG TAAAACTTTT GAACTTAIAA GTGTTATAAA AGAAOAAGCT 4320 

GCCAATAGGG A1GGGCCTAT GATTGTTCAT GATGAGCATG GAGGM3TGAC GGC3USGAACT 4380 

TTCTGTGCTC TGACAAOCCT TAXGCACCAA CXAOAAAAAQ AAAATTCCQI G6ATGTITAC 4440 

CAGGTAGCCA AGATGATCAA TCIGATQAGG CCAGGAOTCT TTGCIOACAT TGASCASTAX 4500

CAOTTTCTCT ACAAAGTGAT CCTCAGCCTT GTGGGCACAA GGCAGGAAGA GAATCX3VTCC 4560 

ACCTCTCTGG ACAGTAATGG TGCAGCATTG CCTGATGOAA ATATAGCTOA GAGCTTAGAG 4620 

TCTTTAGTTT AACACAGAAA GGGGTGGGGG GACTCACATC TGAGCATTGT TTTCCTCTTC 4680 

CTAAAATTAG GCAGGAAAAT CAGTCTAGTT CTOTTATCrG TTGATTTCCC ATCACCTGAC 4740 

AGTAACTTTC ATGACaVTAGG ATTCTGCCGC CAAATTTATA TCATTAACAA TGTGTOCCrT 4800 

TTTGCAAGAC TTGTAATTTA CTTATTATGT TTGAACTAAA ATGATTGAAT TTTACAGTAT 4860 

TTCTAAGAAT GGAATTGTGQ TATTTTTTTC TGTATTGATT TTAACAGAAA ATTTCAA TTT 4920 

ATAGAGGTTA GOAATTCCAA ACTACAOAAA ATGTXTGTTT TTAGTGTCAA ATTTTTAGCT 4980

GTATTTGTAQ CAATTATCA6 QTTTGCTAGA AATAXAACTT TTAATACASI AGCCTGTAAA 5040 

TAAAACACTC TTOCATATOA TATTCAACAT TTTACAACTG CAGTATTCAC CTAAAOTAGA 5100 

AATAATCTGT XACTIATTGT AAATACTQCC CTASTCTCTC CATGSACCIVA ATTTASIATTT 5160 
ATAATTGTAQ ATTTTTATAT TTTACTACTG AQTCAAGTTT TCTAGTTCIG TGTAATTGTT 5220

TAOTTTAATQ ACGTAGTTCA TTAGCTGGTC TTACTCTACC AGTTTTCTGA CATTGTATTG 5280

TOTTACCTAA GTCATTAACT TTGTTTCAGC ATGTAATTTT AACTTTTGTG GAAAATAGAA 5340

ATACCTTCAT TTTGAAAGAA GTTTTTATGA QAATAACACC TTACCAAACA TTGTTCAAAT 5400

GGTTTTTATC CAAGGAATTG CAAAAATAAA TATAAATATT GCCATTAAAA AAAAAAAAAA 5460
AAAAAAAAAA AAAAAAAAAA A






MRIUOtFIAC I 






FKASKITFHW 
ILFEVGTEBN 
TDTVDWIVFK 
TOKEEIHEAV 
HEFLTDGYQD 
LIQTEEIIKB 



U3FKAIIDGV 



21 
I 

NANGYYBQQR 
KFQGWOKTSIi 
raSLEGQKFP 
ESVSWGKQA 
AVFCBVLTMQ 
DPENrrSLI.V 
NMSYVLQIVA 
AIVNPGKDSA 
MSTSQPVTKL 
NrVSITBYEE 
VLIPESARNA 
IRVDESEKTT 
VNWYSQITQ 
YWRKCFQTAH 
FETl>KEFYQE 
DYINANYVDG 
EVIVMITNIiV EKGRRiECDQY WFAOGSEEVG 
DMGVPEYSLP 
IFGFUCHIKS 
TKLEKQPQGL 
AALKQCNREK 



31 
I 

KLVEBIGWSY 
ENTFIRNTGK 
LEMQIYCFDA 
AU3PFILLNL 
QSGYVMLMDY 



GSKTVZ.RSPH 
iWISQGYIFS 
TAQPDVGSGR 
TEVTPHAFTP 



GKGDVFHTSL 
MNLSGTABSL 
SEHPETITYD 
BSPLQTNYTE 



SQKQRPSGRV VTQYHYTQWP 



GTYrVIiDSML 
UJSHIHAYVN 
SRVAQTILLS 
YIMGYYQSNE 
INCESPKVn, 
FELISVIKEE 
RFGVFADIEQ 



ESLIiTSFKU} 
SEDSTS5GSE 
KSFSAGPVMS 
PVYNEASNSS 
FYI.EDSTSPR 
VQSCTVDLGI 
YNRPKAYIAA 
KFLVTQKSVQ 
VLTFVRKAAY 
QKNYLVQTEE 



LPHSTDKYYI 
LQNNFREQQY 
TMIBKFAVLY 
SDQIilVDMPT 
ISTTTHYNRI 
QTVTEIiPPHT 



51 
1 

KKYPTOJSPK 
RVSGGVSEMV 
KGKGKLRALS 
YNOSLTSPPC 
KPSRQVFSSY 
QQLDGEDQTK 
DNPEIiOI>FPE 



GTKYNEAKTM 480 

VEGTSASUID 540 

PATSAIPFIS 600 

NVWFPSSTDI 660 



VISTPPTPIP 



QQK 

AIJ.IPGPAGK 
QSNIQQSDYS 
PIITCHPLLH 
MAEEHKCI.SN 
AAHHDGPMIV 
YOFIiYKVILS LVGTRQEBHP STSLDSNQAA 



NRTSSIIPVE 
DHNAQLWMI 
LEATQDDYVL 
TFCAIjTTLMH 



QGPUCSTAED 
VLAYYTVRKP 
AKBHAVGPW 
QYVFIHDTLV 
TISAHCin.PIi 
RSRVGISSIiS 
POGQHMAEDE 
EVRHFQCPKW 



LBSEKKAVIP 
FISDOVGAIP 
KHKHRYINIV 
PHRMIWEHNV 
TLRNTKIKKG 
VHCSAGVGRT 
EAIUSKETBV 
POliTDPPTSA 



1200 
1260 
1320 
1380 
1440 



Seq ID NO: 582 DNA sequence
Nucleic Acid Accession #: NM_
Coding sequence: 148..7092



CACACATACO 
CAAAAAAAAC 
OGGCGAGGGG 
CAGCTCCTCT 
CTTGTTQAAG 
AAATATCCAA 
CAAGIAAATG 
AACACATTCA 



AAATGCSkATA 
GAGATGCAAA 
G6AAAAGGGA 
GATTTCAAAG 
TTAGATCCAT 
AATGGCXCAT 



ATTTCCTTCG 
COGCAGACCG 
GTGTTTGCCG 
AGATTGQCTG 
CATGTAATAO 
TGAATCTTAA 
TTCATAACAC 
OAGTTTCAGA 
TGTCATCTGA 
TCTACTGCTT 
AGTTAAGAGC 
OGATTATTGA 
TCATACTCTT 
TOACATCrCC 



CTCCCCCTCC 



CCTGGATTGG 
GTCCTATACA 
CX:CAAAACAA 
GAAACTTAAA 



TGGATCAGAG 
TGATGCGGAC 
TTTATCCATT 
TGGAGTCGAA 
GAACCTTCTG 
TCCCTGCaCA 



TGGGAAAGAC 
CAGTTGGATG 
GGTGCTATTC 
TGCACTAATG 



AGQTGTTTTC 
CAGAAAATGT 
CTCGAGTCGT 



AAOCAAATCA 
ACGAAATACA 
AAGGGTGATG 
ACAGAAAAAG 
GAAGGTACTT 
AACTTGTCGG 
AGTTTATTGA 
GCAACTTCTQ 



TCAATAATTT 
GCTTATATQa 
TTGATCTTTT 
AAQACATC6A 



GGACTACTTA 
CTCATACACT 
TCRGGCTGAC 
TTATGATACC 
AACCAAGCAT 
GCTACOCAAT 



CCCT6AATTA 



AGCTTTCTCC 

CATTATTCTA 
TCCAGACAAC 
GTATACAATG 
ACCOCTTTGT 



TTCCCAATAC 
ATATTTCCTT 
CAGCCTCTTT 
GGACTGCAGA 
CCAGTTTCakA 
CTATCCCATT 
ASACAATAAC 
CTTCATCAGG 
CTAGCTCTAC 
AGACTAATTA 
CAGGCCCAGT 
CCTTTGCCTA 
AGGATTTGGT 
QTQAGACACC 
TGCTTGACAA 



ATCTTTAAAT 
GACTTCTCAG 
AAATGATGGC 
ATCCXTAAAT 
GCTTGATACT 
CATCTCTGAO 
ATATGATGTC 
TTCAGAAGAA 
AGACATAACA 
CACTGAGATA 
GATGTCACAG 
CTTCCCAACT 

ctccAcaarc 

TCTTCAAtXrr 
TCAOATCCrC 



31 41 

I I 

TCTATACACT GGAGGATTAA 
CTCTCCACTC TGAGAAGCAQ 
CGAATCCTAA AGCGTTTCCT 
GCTAATGGAT ACTACSVGACA 
GGAGCACTGA ATCAAAAAAA 
TCTCCTATCa ATATTQATGA 
TTTCAOGarT GGGATAAAAC 
GTCGAAATTA ATCTCACTAA 
AAA6CAAGCA AGATAACTTT 
CATAGTTTAG A AGGA CAAAA 
COATTTTCAA GTTTTGAGGA 
TTGTTTGAGG TTGGGACAGA 

agtgttagtc gttttgggaa 
ccaaactcaa ctgacsuvgta 

GACACAGTTG ACTGGATTCT 
GTTTTTTGTG AAGTTCTTAC 
CAAAACAATT TTCQAGAGCA 
GGAAAGGAAG AGATTCATGA 
CCAGAGAATT ATACCA6CCT 
ATGATTGAGA AGTTTGCAGT 
GAATTTTTGA CRGATGGCTA 
ATGAGTTATG TTCTTCAGAT 
GACCAACTGA TTGTCX3ACAT 
ATTGGAACTG AAGAAATAAT 
ATTGTGAATC CTGGTAGAGA 
TCTACCACAA C»CACTACAA 
TCCCCAACAA GAGGAAGTGA 
TCCaUTTTCCC AACCAOTCAC 
ACTGTGACTG AACTGCCACC 
TCTAAAACTO TTCTTAOATC 
ACAGTTTCTA TAACAOAATA 
GGAGCTGAAG ATTCTTCAGG 
AACATATCCC aVAGGGTATAT 
CTTATACCSVG AATCTGCTAG 
TCACTAAAGG ATCCTTCTAT 
GCACAGCCCX3 ATGTTGQATC 
CGTGTTGATG AATCTGAGAA 
GGTCCCTCavG TTACftGATCT 
GAGGTAACAC CTCRTGCTTT 
AACGTGGTAT ACTCGCftOAC 
TCCTACAGTA GTGAAGTCTT 
AACACTACCC CTOCTGCTTC 



51 

! 

AACAAACAAA 
AGGAGCCGCA 
CGCTTGCATT 
ACAGAGAAAA 
TTGGGGAAAG 
AGATCTTACA 
ATCATTGGAA 
TGACTACCGT 
TCACTGGGGA 
ATTTCX^CTT 



AGCAGTCAAA 660 
AGAAAATTTG 720
GCAGGCTGCT 780 
TTACATTTAC 840 
TTTTAAAGAT 900 
AATGCAACAA 960 
ACA6TACAAQ 1020 
ABCAGTTTGT 1080 
TCTTGTTACA 
TTTQTACCAG 
TCAAGACTTG 
AGTA6CCAZA 
GCCTACIGAT 
CAAGGAGGAG 
CAGTGCTACA 



1200 
1260 
1320 
1380 
1440 



ATTCTCXGGA 
TAAATTAGCC 
TCACACTGTG 
TCCACATATG 



CTCCAGTCCC 
ATTTTCCTCC 
AAATGCTTCC 
GGAGGGAAAT 
AGGCAGAGAG 
GACAACCAAG 



1740 
1800 
1860 
1920 

2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 



TOGGCCTTGC ATGCTACGCC

TcrrccTATa atggtgcacc 

TTTCGCCATC TGCATACACTT 
GRTAAGGTGC CCTTGCATGC 
AGCCTT6CTC AGTATTCTGA 
TTTGGTAGTG AATCTGGTGT 
AGCAGTGATG CCATGATGCA 
QATAATGAGO GCTCCCAACA 
GATTCTGTO3 GTOTAACTTA 
CXTTAAGTCTT CGTTAATAAC 
QGTGATGGGG AATGGTCTGG 
GGGCTGACAG CCCTTAACAT 
TCTGTGTTTG OTGATGATAA 
ACTGAACTGC AAATTCCTTC 
CCCAACATGT ATGATAATGT 
ATTTCTAGCA CCAAGGGCAT 
GATCATGAGA TTAGTCAAGT 
TCTCAAGCAT CTGGTGACAC 
TCCTCTGACC CTGCTTCTAG 
ACCTCAGCTT CTTTTAGTAC 
GACACCTTGC TTAAAACTGT 
CCCAAAGTTG ATAAAATTAO 
AGTGAAAACA TGCTGCACTC 
ATGCACTCTG CTTCACTTCA 
GTTTTGTTAA AAAGTGAAAQ 
TTGTTCCAAA OSGCCAATTT 
TTTGCTACAC CTGTTTTATC 
CATTCCGATG AAATTTTAAC 
ATTCCAACAG TTGCTTCTGA 
GGGCATGTTG CCATTACAGC 
TTG C T GiTT C CTTCTAAGGC 
tTAGTGGGTG GTGGTGAAGA 
GATAGTGATa QCTTATCCAX 



TGTATTTCCX: 
TTTGCTTCCA 
TTCTCAAATC 
TTCTCTGCCA 



TCTTTATAAA 
TGCACGTTCT 
CATCTTCACT 
TCAGGGTTCC 
CCCAACTGCA 
AGCCTCTTCT 
TTCTTCACCT 
TAAGGCX3Crr 
TTTCAATGAG 
AAATAA6TTG 
GTTTCCAGGa 
TCCAGAAAAT 
TTCGCTTAAA 
TQAAATGTTA 
TGAAGTATTG 
TCTTCCAGCT 
TTCTACAATG 
TACATCTGTA 



TTCCXSVCCAA 
GGAGATTAAC 
AATTGATGAA 
CTCCACCAAA 
TACATTTGTA 

AACTTCTGAG 
TGGTGACACT 
TCATAAGTGT 



AGTGTCQATG 
TTTTCCTCTG 
CTTCCACAAG 
GTGGCTGGGG 

ACX3CTTATGT 
TOUMGCCTG 
GTTTCTTACyV 
TTATTTAGCG 
TC3VTTACTGC 
GATAGTGAAT 
GTTTCTGTAG 
TCTAAAAOTG 
ATGGTTTACC 
AATGOSTCTT 
TOCETTGCTC 
AACTTTTCAO 
CCTGTGCTTA 
TCTCCTTCAA 
CTACAACCTT 
GTGCCCAGTG 
TTGCATCTCA 
CCAGTTTTTG 
ATTTCCTATG 
OTOGTACCTT 



TGTCATTTGA 
CTTCCTTCAG 
TTACTTCAGC 
GTGATTTGCT 
CTGCTTC31GA 
TTTCTCAAGT 

AACxrrrcTTA 

GTTCTGCAAT 
QCCCTAGCCA 
AGCCTACTCA 



ATCCATCCTG 
TAGTGAATTG 
TACCXJAGAGT 
ATTAGAGCCC 
GACGCTGGAA 
TGAACCACCC 



CTGAATTTAC 
AAATAATATA 
CTTCTGAAAG 
TACAAGAAAC 
ATACCAGCAC 
TTCAACCTAC 
GTGCAAACTC 
CTCAOCTCTT 
CCTTTCAGGC 
ATCCAATATT 
TTGTATCftAA 



TATACCAATA 
TGCCXrrCTCT 
TGACACAGAT 
ATATACAACA 
TQ6AAATGAQ 
CACAGTCATO 
CTCTQTTTCC 



ACATAGTCTC 



ATCTCRTACT 
QACAGTCAAA CTGGTATGGA 
CAAAAGCACA ATGATGGAAA 
CTCAGCCCTG TiATCTAAAGC 
CAAGGTACCT CAGATAGCCT 
ACTAATGAAA AAGATGCTGA 
TTCCGACAGT CCCCAACATC 
GAOGCASAGG CCAGTAATAG 



6AATTCTGAA 
CAGAAGTCCT 
AGAGGAAAAT 
ATGGGCAGTT 
TAATGAOAAT 
TGGGATCCTG 
ATCTGTTACT 
TAGC CATGAG 
ACCCCTTGTO 
OTGGTTCTTG TGGGTATTCT CATCTACtGG 
TTAGAGOACA QTACATCCCC 
TCAGATOATQ TOGGAGCAAT 
GCAAGTAGTG GGTTTACTGA 
AGCTGTACTG TTGACTTAGG 
AAGAATOGAT ACATAAATAT 
ATGGCAAACT 
CTTATATTGC 
AGAATBATAT GGOAACATAA TGXGGAAGTT 
AATOTGATCA 
AGAAGAGTGT 
AAATAAAAAA 
ACACGCAGTG 
OAAAGGCAGC 
GAGTTGGAAG 
AAGGAACTGT 
TACAAACTGA 
AAGAAACTGA 



CACAGAGATG 
CTGAGTCATA 
GAXGATGATG 
ATGTCATGCX 
GAAAACAGTC 



G6TAAATCAC 
GACATTCAGA 
CTGACAAGTG 
GAGACTTCCA 
GCAGCAGGTG 
AGCGAGAACT 



TCCSMkTAAAG 
AGAATTTGAG 
TATTACAGCA 
CGTTGCCTAT 
GACTGATTAT 



AG6AAATGCT 
TCCACACCTC 
CACTTTCCAA 
ACACTGAAAG 
GAOMSCTCCA 



CAAGTGAGAA 
CTTTOTACAG 
CCCCAAAAGG 
CACTAATAAA 
CTGGTAAGOT 
ATTCTGTTCC 
GTTCTGTAAC 
GTGCCAAATC 
GTGATGATGA 
CATCCTATAG 
TTATGGATCA 
GAOTCACAAQ 
CATCAGCAAA 
CTGGTAGTGC 
ATGAAGAAAG 
CAQATTTCAG 
ACTCAGAAAT 
CAQAASTGTT 
GTCTAOCTGA 
CCCTGACTTT 
TCCAGACTGC 
CAACACCTAT 
AGCATGTTGC 
AGTTTTACX31 
ACCACCCAGA 



ATTTTATGAG 
TTCTGATGTT 
GGTTGAAACC 
TTCTGCTTCA 
TACTTCTCAT 
ATATGAACCA 



AAGGCATGTA 
TAAGCTTATA 
ATTTGCTGGT 
TATAGGAAAT 
CTCAACAAAG 
TGATGCCGGT 
TGATGAC3«» 
AGAATCACAG 



TGGATCAGGG 



CCACGTTTCA 



TATCTGTCTA 



CTTTCCAATT 



CTGGTCACTC 



C3U3TATCACT 
ACCTTTGTGA 
TGCAGTGCTG 
ATTCAACAOG 
AATTATTTGQ 



TCAAATATAC 
OQAACTTCTT 
GAAGGCACAG 
ATCATTACCC 
CATAATGCCC 
GTTTACTGGC 
GCTGAAGAAC 
GAAGCTACAC 
AATCCAGATA 
GCCAATAGGG 
TTCTGTGCTC 
CAGGTAGCCA 
CAGTTTCTCT 
ACXrrCTCTGG 
TCTTTAGTTT 
CTAAAATTAG 
AQTAACTTTC 
TTTGCAAGAC 
TTCTAAGAAT 
ATAOAOGTTA 
GTATTTGTAG 
TAAAACACXC 
AATAATCTGT 
ATAATTGTAO 
TAGTTTAATG 



AGCAGAGTOA 
CTATCATCCC 
ACTACATCAA 
AGCACCCTCT 
AACTGGTGGT 
CAAATAAAGA 
ACAAATGTCT 
AGGATGATTA 
GCCCCATTAG 
ATGGGCCTAT 
TGACavACCCT 
AGATGATCAA 
ACAAAGTGAT 
ACaGTAATGG 
AACACAGAAA 
GCAG6AAAAT 
ATGACATAGG 
TTGTAATTTA 
QGAATTGTGO 
OGAATTCCAA 
CAATTATCAG 
TTCCATATGA 
TACTTATTGT 
ATTTTTATAT 



CTATGCCAAG 
AACAGGCACA 
CAACATATTT 
GGAGCAATAT 
GGTGCTGGAC 
CAAAACAAAG 
CTATTCTGCA 
TGTGGAAAGA 
TGCCTCCTAT 
CCTTCATACC 
TATGATTCCT 
TGAGCCTATA 



ATCRATGCCA 
CCACTGAAAT 
ATTGTCATGA 
GCC6ATGGGA 
QCCTATTATA 
AAAGGAAGAC 
GGAGTACCAG 
CGCCATGCAG 
TATATTGTGC 
GGCTTCTTAA 
GTCTTCATTC 



ATTATGTTGA 
CCACAGCTGA 
TAACAAACCT 



CTAGAGAAAC 
GCCCTAAAGC 
TCAAGGGTro 



CTGTGAGGAA 
CCAGTGGACX3 
AGTACTCCJCT 
TGGGGCCTGX 
TAGACAGTAT 
AACACATCCQ 
ATGATACACT 
ATGGCTATGT 
AATTCCAGCT 
AATGCAACAG 
GCATTTCATC 
ATTACCAGAG 



GGAAGTGCAG 
CAACAAGCAC 
AGCACAGCTT. 
TGGCTACAAC 
AGATTTCTGQ 
CMTGGAGAAA 
CX3GGAACTTT 
TTTTACTCTA 



2580 
2640 
2700 
2760 
2820
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540
3600 
3660 
3720 
3780 



4020 
4080 



4560 
4620
4680 

4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 






GCCAGTGCTG 
TGTCGTCCAC 
GTTGCAGCAG 
TTCACAAAGA 
GGTTGAGGCC 
TAATGCACTC 
CCTGAGCCAG 
GGAAAAGAAT 
CCTQAGTGGA 
CAATGAATTC 



TGTACTTGAA 
TAAAACTTTT 
GATTGTTCAT 
TATGCACgAA 



ATCSW«3GATT 
GATGGCX»AA 
AATTGTGAGA 
OAAAAACTTA 
GTGAGGCACT 
GAACTTATAA 
GATGAGCATG 
CTAGAAAAAG 
CCAGGAGTCT 



ACATGGCAGA 
GCTTTAAGGT 
TAATTCAGGA 
TTCAGTGTCC 
GTGTTATAAA 
GAGGAGTGAC 
AAAATTCCGT 



TGCAGCATTG CCTGATGOAA ATATASCTGA C 
GACTCACATC 
CTGTTATCTG 
CAAATTTATA 
TTGAACTAAA 
TGTATTGATT 



CAGTCTAGTT 
ATTCTGCCGC 
CTTATTATGT 
TATTTTTTTC 
ACTACAGAAA 
GTTTGCTAOA 
TATTCAACAT 
AAATACTGCC 
TTTACCACTG 



AATATAACTT 



CTAGTGTCTC 
AGTCAAGTTT 
TTACTCTACC 



TTGATTTCCC 
TCATTAACAA 
ATGATTGAAT 
TTAACAGAAA 
TTAGTGTCAA 
TTAATACAGT 
CAGTATTCAC 
CATGGACCA& 
TCTAiQTTCTG 
AlSTTTTCTOA 



CACTCTTATG 
CTTTATCTTA 
TAAATGGCCA 
AGAAGAAGCT 
GGCAGGAACT 
GQATGTTTAC 
TQAGCAGTAT 
SAA.TCCATCC 

TTTCCTCTTC 
ATCACCTQAC 
TGTGTGCCTT 
TTTACAGTAT 
ATTTCAATTT 
ATTTTTAGCT 
AGCCTGTAAA 
CTAAASTAlSA 
ATTTATATTT 
TQTAATTGTT 



5460 
5520 
5580 
5640 
5700 

5820
5880 
5940 
6000 
6060 
6120 
6180 



6420 
6480 
6540 
6600 
6660 
6720 



6960 
7020 
7080 
7140 
7200 
7260 
7320 

7440 
7500 
7560 
7620 
7680 
7740 




TGTTACCTAA QTCRTTAACT TTQTTTCAGC ATGTAATTTT AACTTTTGTG GAAMTAGAA 7800 

ATACCTTCRT TTTCAAAGAA GTTTTTATSA GAAXAACACC TTACCAAACA TTGTTCAAAT 7860 

OSTTTTTATC CAAGGAATTG CAAAAATAAA TATAAATATT GCCATTAAAA AAAAAAAAAA 7920 
31



I 






IQIjLCVCRliD WANGYYbQQR KLVEEIGWSY 
TQVNVNI.KKL KFQGWDKTSL EMTFIHNTGK 
GKCNMSSDGS EHSLEGQKFP I4EMQIYCFDA 
LDFKAIIDGV ESVSRFGKQA AjUDPFILLNIi 
DTVSISBSQIi AVFCEVIiTMO QSGYVMLMDY 



MRILKRFIiAC 
QSPINIDEDIi 
FKASKITFHW 
ILFEVGTEEaj 
TDTVDWIVFK 
TGKEEIHEAV 

HEPLTDGYQD LGAILNNIiIiP HMSYVIiQIVA ICTNGLYGKr S 

TOQIRKKBPQ I 

ATEKOISLTS C 
ESUTSFKLO 
SEDSTSSGSE 



41 

I 

TGALNQKNWG 
TVEINLTNDV 
DRFSSFEEAV 
LEHSTDKYyi 
UINNFREQQY 



KFSBQVFSSY 300 



<e QQUXSBDOTK 



LIGTBEIIKB 
RSPTRGSEFS 
GSKTVLRSPH 
KNISQGYIFS 
TAQPDVGSGR 
TBVTPHAFTP 
LNTTPAASSS 
ILFQVTSATB 



SliFSGPSHIP 
PVSVTIBFTYT 
LNASLQBTSV 
KPVLSANSEP 
AVPSDPII-VE 
TISYASEKYB 
BPUmUIKL 



GKGDVFNTSL HSTSQPVTKIi 
MMLSGTAESIi NTVSITEYEE 
SBNPETITYD VIIPESARNA 
ESFLQTNYTB IRVDBSEKTT 
SSRQQDLVST VHWYSQTTQ 
DSALBATPVF PSVDVSFESI 



GSIJVHTTTKV 
liSPSTQLLFY 
MUaiVSHSA 
QWPSIiYSND 



IPKSSLITPT 
TSVFGDDNXA 
SISSTBCGMFP 
ASSDFASSEM 
TPKVDKISST 
PVLIiKSESSH 
IHSDEILTST 



FVYNGETPLQ 
LSSYDGAPUi 
PSLAQYSDVL 



SSENMLKSTS 
EbE^TANIiEI 
GIPTVASOTF 



QGPSVTOIiEM 
PSYSSEVFPL 
PFSSASFSSB 
STTHAASETL 
TVSYSSAIFV 
SDSBFIiLPDT 
BMVYPSESTV 
NNFSVQPTHT 
LLQPSFQASD 
VFVPDVSPTS 
NQAHFPKiGRH 
VSTDH5VPI6 



CMSCSSYSBS QBKVMNDSDT BBSSlMDQtm PISYSIiS&HS BEENRVTSVS SDSQTGMDRS 
FI.SPBSXAHA VLTSDEESGS GQGTSDSIiNE 



PGKSPSANOIi 
METSTDFSFA 
ESRIGIiABGL 
ISTPPTPIFP 
ADSSNHPDNK 



LAYYTVBNFT 
KRHAVGFVW 
YVFIHDTLVB 
AAUCQCNRSK 



DTNEKDADGI 
ESBKXAVIPIi 
ISDDVGAIPI 
HKNRYINIVA 
WRMIWEHNVB 



NDIQT6SALL 
LAAGOSEITP 
VIVSAIiTFIC 
KHFPKHVADL 



VIVMITNLVE 
QiOSRPSaRW 
TYIVUISMLQ 



KRTSSIIPVE RSRVGI8SI.S 



HASSGFTEEF 
LAEKDGKI>TD 
KGHRKCBQYW 
TQYHYTQWPD 
QIQHEGTVNI 
LIiIPGFAGKT 
GE6TOYIHAS 



TSEHSEVFHV 
WRKCFQTAHF 
ETLKEFYQEV 
YINAKYVDGY 
PADGSEEYGN 



EBKIiIigOFI IiEATODDYVL EVRIIFQCFKH 
HDEHGGVTAO TPCALTTLMH QLEKENSVDV 
liVSTRQEENP STSLDSHGAA LPDGNIAESL 



Seq ID NO: 584 DNA sequence 

Nucleic Acid Accession #: NM_005688.1

Coding sequence: 126..4439



I 

GGCTCATGCT 
QAATTCTGAT 
GGATATCGAC 
GAGAACCAQC 
ACOSrTGGAA 

tgcctccatg 
tcatggcttg 
caat6ciggg 
ccacaagaao 
tqaogtoaac 

AGACGCTGCr 
CATOSTGTGC 
ACACCTCTTQ 
GCTGGGCCTC 
GAATTACCGA 
CCrrAAGTTA 
QGAXGGGCAG 
TOTTGCCATC 



FGFIjKHIRSQ 
KLEKQFQIjLS 
YIKSYYQSIIE 
INCESFKVTL 
FBIiISVIKEB 
RPGVFADIBQ 



GTKYNBAKTN 480 

VEGTSASUID 540 

PATSAIPFIS 600 

NVWFPSSTDI 660 

PHYSTFAYFP 720 

VTPLIiIiDNQI 780 

I.FRaiiBTVSQ 840 



1020 
1080 
1140 
1200 

1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 



HDSVGVTYQG 
DGLTALNISS 
MFNMYDNVNK 
VSQASGDTSL 
VDTLLKTVLP 



YLEDSTSPRV 
QSCTVDLGIT 
HRPKAYIAAQ 



LTFVRKAAYA 
RHYLVQTEEQ 
QSNIQQSDYS 
FIITQHPIiLH 
MABEHKCIiSN 
AANROGPMIV 
YOFbYKyiLS 



AGGGGOGCAO 
AGAAGATGAA 
GTGTGAGGGA 
GGAGAACTO? 
TCTCTCTTGA 
GAAAGTACCA 
ACCCRGTGGA 
CXX^TGTGGC 
ACGAGTCTTC 
AAGTTGGGCC 
TCATCCTGTC 
TCATGGTOAA 



TQOCAAOATa CCTTGQAAAC 



\ AGCCCATCOG 
r GTATOACTTT 
r CAATGGAAGA 
; TAGAGAOACT 



-TG. 
GCCCTGGAAC 
CCCCAGTCCT 
CCGTGAAGAT 
A6CAGCCCGA 
GGATGAGGAG 
GACTACTTCC 



CGTGTGGTCT 



TCCAAGTTCA 
GCCGAGGGCC 
CATCCCAAGG 
AAACACCAGC 
TCTTCTCTGG 
CTGTCCAAGC 



CTTGGGCATT 
TTAAGAAGAT 
TTTGCTCCAA 



CTCCTGACGG 
ACCGGTGTCC 
AAGAACATTA 



ATGAASTTCT 
AGA6TGTTCA 
AGGGTATCAC 
CTGTTCATAT 
TCTTCAATTC 
AAGCCTCAGT 



ATCAGCTGTT 
TTTCAGGAGA 
TACTTACATT 
AAAAATCCGC 
XGTGGGTGTG 
GACCCTGGGC 
CATGACTTTT 
GGCTGTTGAC 
ACCAGCCAOT 
CX»CTCCAOT 
TTCCABOGGC 



TTAGGCATGA 
TTTATCCTCT 
AAATGOGrOO 
AAATTTATCA 



AGGCAACAOA GTCTAACCTG C3«5TACAGCT 
AAATCGTGCG GTCTTGGTCX3 CTTGCACTGR 
GCTTGCGGGG GGCXXTCCTA ACCATGGCAT 
AAGAGAAATC CCTGGGTGAG CTCATCAACA 
AGGCAGCAGC OGTTGGCAGC CTGCTGGCTa 
TTTATAATGT AATTATTCIG GGACCAACM 



GCTCCCATTG 
TTCXJATCTGA 
GCTTTGAAAG 
AGATTTAAGA 
CXrrCACATCA 
ATCCAOAACT 



AAAT6TATGC 
GTCGOATATT 
TGGTGGTGAT 
CAGCAGCACA 
TAAC ACCOT T 
GTTTGTTTCT 
AGATAGAGAT 



TGAACX3TGTC 
CTGGGTCAAA 
GGAAAAAGCC 
TGCCAGCGTG 
GGCTTTC3VCA 
TTC3U3TAAAQ 
AATCGAAGAO 
GAAAAATGCC 
GACCCCCAAA 



CAC^UUIATGA 



GG6TACTTCC 
GTGACCTTCT 

gtgstgacag 
tccx:tctcaq 
qttcacatga 



ATOAAAAAAO 



1020 
1080 
1140 

1260 
1320 
1380 
1440 
ISOO 
1560 
1620 
1680 




ACAGCATCX3A TCTGCSAOATC 
GTGGAAAAAC CTCTCTCATT 
TTGCAATCAG TGGAACCTTC 
TGftGAGACAA CATCCTGTTT 
ACAGCTGCTG CCTGAGGCCT 
GAQAGCGAGG AOCCAACCTG 
TGTATAGTGA CAGGAGCATC 
TGGGCAACCA CATCTTCAAT 
TTGTTACCCA CCAGTTACAG 
GCTGTATTAC GGAAAGAGGC 
CCATTTTTAA TAACCTGTTG 
AAACCAGTGG TTCACAGAAG 
A6GAAAAA6C A6TAAAGCCA 
GTTCAGTGCC CTGGTCAGTA 
TCCTGGTTAT TATGGCCCTT 
GGTTGAGTTA CTGOATCAAG 
CCrCGGTGAG TGACAGCATG 
CCCTCTCCAT GGCaGTCATG 
GCaCGCTGCG AGCTTCCTCC 
CTATGAAGTT TTTTGACACG 
TGGATGAAGT TGACGTGCGG 
TGGTGTTCTT CTGTGTGGGA 
GGCCCCTTGT CATCCTCTTT 
TGAAGOGTCT GGACAATATC 
AQGOCCTTGC CACCATOCAC 
AGCTGCTSGA rOACAAOCAA 
CTGTGOGGCT GOACCTCATC 
TTATGCaCGG GCAGATTCCC 
TAACGGGGCT GTTCCAGTTT 
CX3GTGGAQAG GATCAATCAC 
AQAACAAGGC TCCCTCCCCT 
AGATGAGQTA CCGAGAAAAC 
CTAAAGAGAA GATTGGCATT 
CCCTCTTCCX3 TCTGGTGGAG
CAAGAGGGTA
TCAGCCATTT 
GCTTATGTGO 
GGGAAGGAAT 
GACCTGGCCA 



AACTGGTTGG 
TAGGCCAGAT 
CCCAGCAGGC 
ATGATGAAGA 
TTCTTCCCAG 
AGCGCCAGAG 
ACGACCCCCT 
GGAAACATCT 
ACTGTGATGA 
AACTGATGAA 
CACOGCCAGT 
ACAAGGQTCC 
AGCTTOTGCA 
ACATCCAGGC 
TTCMTjCTGA ATGXAOaCAG 



TACATCCTGG 
AGTGCTATCC 
TACCTGGTro 
ACXXa^TGAGG 



CAOraCCTTJ 
CAASTCCAAO 
AGTGATCTTC 
TTTAAATGGT 



AQTGTGGGAA 
GAGGGCAGCA 
AATGCTACTC 
TCTGTGCTGA 
AOGGAGATTG 
GCCOGGGCCT 



ACAGTTCT6T 
ATGAAAGAGG 
GACTATGCTA 



TAAAACAGGA TCAQTAAAGA 
GCTGGAAGAO AAAOGGCAGG 
TGCTGGGGGC CCCTTQGCAT 
CAOOGOCTTC ASCACCTGGT 



AAGGACAATC 
CTGATCCTGA 

CGGCTGCATG 
ACCCCCACAG 
CTGCCaTTCC 
ATGATCGCAG 
TCAGTCCTGC 



CTCRTATGCA 
AAGCCATTOO 
ACCAGCTTTT 
GGAGGATTCT 



GCCT ACAATA 
QCTCCTTTTT 
AGCATCGCCC 
CCAGCCTATG 
ACGGTCAGAC 



QAGTCTTCCC 
ACRTTGTCTC 
CTTTCCTCTC 



CTCCCTCTTG 
GTGGGGCGGA 
TTATCTGGAG 



TTTTGTTTAC 
TCATCACCAC 
CGGGTCTCGC 
TGGCATCTGA 
CTCTQTCCTT 
AGGAGGGAGA 
TCCTAAAGAA 
CAGQATCAGQ 



GTACTATOCC 
AGGAGTTGTC 
CCGAAGGATC 
CAACAGGTTT 
GTTCATCCAG 
GTGGTTCCTT 
CAGGGTCCTG 
CCACATCACG 
GTTTCTGCAC 



AGCATCTACG 
TTTGTCAAGG 
CTTCGAAGCC 
TCCAAAGACA 
AACGTTATCC 
GTGGCAGTGG 
ATTCGGGAGC 
TCCaCCRXAC 
AGATACCAiQO 



CACGGGGCTG 
CATCTCTTAT 
GACAGAAGCT 
GOAAGCACCT 
GGTGACCTTT 
AOTATCXn-TC 
6AAOTCCTCQ 



ATGATCGTTC 
GCTGTCCAGT 
CX3ATTCACCT 
GCCAGAATTA 
GAGAAOSCAG 
ACGATCAAAC 



T6TTCAGTGG 
TTTGGGATGC 
TTGAATCTGA 
GCATAGCTAG 
CCATGGACAC 
OTACCATGCT 
TGCTGGCCCA 
GTTCCCQATT 



CCCCTCATCQ 
GTTCCGOATT 
ATTCCATATT 
GGGAACCGTT 
TCTATATATA 
7ATTAAAATA 
TTGCTGTACT 



CCTGGAGAGG 
AGTGATGGAG 
AGCCCTGCTC 
AGAGACAGAC 
GACCATTOCC 
QGGACAQGTG 
CTATGCO^TG 
TGACGAAGTC 
CGTCCTCCTA 
GGCTTGTGTG 
CATGTAAACA 
ATTATAATTG 
ATTCTGTACA 
AGCACTGTGC 



TCAAATTTGG 
ACACACATGA 
AATGGGGATA 
CGCCACTGTA 
TTATTGATTC 
CATCGCCTGC 



GCTGCATCAA GATTGATCeA GTGAGAATCA 
AACTCTCTAT CATTCCTCAA 
ACXXCTTCAA CCAGTACACT 
TGCTCAGCTA 



2280 
2340 
2400 
24fi0 
2520 
2580 
2640 
2700 



3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 

3720 
3780 
3840 
3900 
3960 



TTTGCTQCTG 

CCGAA ACCT T 
TTTCACTTTT 
AAATTTAGTT 
TATCAGAGGC 
TAGCCTATAT 
TAATAACAGT 
TTTTCCTATT 
GTGCXaGGTT 
ATAQTOOGCC CTCCGACAGC CCCCTCTGCC 
GACCATGCAG 
GTTTCTGTCA 
ATGGGGATCA 
GCTGTTGTTT 
ATGGCTGGCC 
CCAACTGCTG 
CAGTGGCAGG 
QTCTCTCTCT 
CTCACACTCG 



ACTTCTCAGT 
AGATTCTGAT 
AAGAGACCAX 
ACACG6TTCT 
ACACCCCATC 
CAGAOAACAA 



CCGAGAAGCA 
AGGCTCCGAT 
G6TCCTTCTG 
GGTCGCTOTC 



CCTCTGAAAC 
CAGCTCTTGT 
GCCACAGCTG 
TTTGCAGACT 
AGGATTATGG 



AAGGGCIGAC 



TTACAGTGAA 



AGACTGTAGG 



CTGTCXrrGGT GTCACTTACT 
TTTCACTCCC TCCATCAAGA 
TTTCCTGCCT TCTTCTTTTT 
TCtXSVCTGCC TCAOGTTCCT 
GTTGGTTCCA AGCCCTGGAG 
ATTCCXSVCAC CTCCRCAGTT 
CTCACOGCAG TCGTCGCACA 



GGGGCTGOTA GCTCAGGTGG GCSTGGTCAC 
AXGTOGIGAC CAACTAGACA TTCIGTOGCS: 
CAAAAATCXG AAAATGTQAA TAAAATTATT 



GCCTCCCCAC 
AGCGCOGTGA 
GGAGAGCAGC 
OWSAGACATT 
CTAAACAAGA 
ACTGCA CAGA 
CTTTTTGAG6 
GCTCAGGATT 
CTCTCTCCCC 
CGTAGAAGTT 
GTOTGTTCCC 



ATTTTATCTT 
ATATTTTQAT 
ATTGCACTCT 
GCTTTATACG 
AATGTAAGCT 
TTCTATCATT 
AAGAGTAGCA 
CCAAAGGAAG 
AGCXXSCTCCA 
GTTCrCAGGG 
GGGGCGAAGC 



CTCCTGCCTT 
CCAGGCCCCT 
GGGGAGTTTC 



: TCCAAGACCT 



TCGTGGGTCT 
TCAAAGTCTG 
TTTGTACTGT 
6CAAACCCCC 
AGTTGAATGG 
T6CTGAACAC 



4260 
4320 
4380 



TCGCACAGCA 
TATTGTATTT 
AAAAGGTTCA 
TGTAGCTATA 
GTTTATTTTA 
TTTGTACAGT 
TTTCATTCTT 
ACGTGTGGCA 



GTTTTCCTTT 
CAACTTTAAG 
AAAGAGACCT 
TTTGTGCTGT 
TCAGGGTTGC 
CTTGIGQAAG 



4980 
5040 
SlOO 

5220 
5280 
5340 
5400 



5640 
5700 
5760 
5820 






MKDIDIGKEY I 



VAHKKGELSM 
LSIVCIiMITQ 
AUJYRTGVRL 
PWAIIiGMIY 
VLTVIKPIKM 
HMTLGFDLTA 
HKPASFHIKI 
VLAEQKGKLL 
KTSIiISAIIiG 
CCLRPDIAIL 
MaiFNSAIRK 



LAGPSGPAFM 
RGAILTMAPK 
NVIILGPTGP 
YAWVKAPSQS 
AQAFTWTVP 



SSDVNCaiRIjE 
VKHLIiEYTQA 
KILKLKHIKE 
LGSAVFILPY 
VQKIREEERH 
HSMTFAUWT 



LOSOERPSPB 
QMTLLEQ3IA 
PSSDLTEIGB 
HLKSKTVLFV 



RIA<QBELNEV 
TESNLQYSUi 
KSIiGELINIC 
PAMMFASRLT 
ILEKA6YPQG 
PFeVKSMEA 
lOiTPKMKXDK 



GPDAASLRRV VMIFCRTRLI 
LVLGIiLLTEI VRSWSLAI.TW 
SHDGQRMFEA AAVGSUAGG 
AYFRRKCVAA TDERVQKMNE 
ITVGVAPIW VIASWTPSV 



ISGTPAYVAQ 
RGAHIiSGGQR 
THQLQYLVDC 



SDRSIYILDD FIiSALDAHVa 
ITERGTHEBL MHtNGDYATI 
KAVKPEBGQI. VQI.EBXGQGS

VPWSVYGVYI QAAGGPUVFL VlMALFMtNV GSTAPSTWWL SYWIKQGS3N TTVTRGNETS 900 

VSDSMKDNPH MQYYASIVAL SMAVMLILKA 1RGWF7KGT LRASSRLHDE LFRRILRSPM 960 

KFFDTTPTGR ILNRFSKDMD EVDVRLPPQA EMFIQNVILV PFCVGMIAGV FPWFLVAVGP 1020 

LVILFSVLHI VSRVLIRELK RliDNITQSPF LSHITSSIQG lATIKAnJKG QEFLHRYQEL 1080 

LDDNQAPFFL PTCAMHWIAV RUJLISIALI TTTGLMIVLM HGQIPPAYAG LAISYAVQLT 1140 

GLFQFTVRLA SETEARPTSV BRINHYIKTL SI.EAPARIKK XAPSPDWPQB GEVTFEHAEN 1200 

RntENiPLVL KKVSF7IKPK EKIQIVGRTQ SGXSSLOnL FRLVEI.SGGC IKIDGVRISD 1260 

IGLAOUlSKIi SIIPQEFVLF SGTVRSNLDF FNQYTBDQIH DAI.ERTHHICB CIAQLPLXLE 1330 

SEVMEtKSDNF SVGERQLU:i ARALLSHCaCI LIUDBATAAM OTBTDLLIQB nitSAFAOCT 1380 
MLTIAHHLHT VLGSDRIMVL AQGQWEFDT PSVUSNOSS RFXAMPAAAE HKVAVKG 

Seq ID NO: 586 DNA sequence 

Nucleic Acid Accession #: NM_001327.1

Coding sequence: 89..631

1 11 21 31 41 51 

AGCAGGGGGC GCTGTGTGTA CCGAGAATAC GAGAATACCT CGTGGGCCCT GACCTTCTCT 60 

CTGAGAGCCG GGCAGAGGCT COGGAGCCAT GCAGGCCGAA GGCXXJGGGCA CAGGGGGTTC 120 

GACGGGCGAT GCTGATGGCC CAGGAGGCCC TGGCATTCX:? GATOGCCCAG GGGGCAATGC 180 

TGGCGGCCCA GGAGAGGCGG GTGCCAOGGG OKSCAGAGGT CCCOSGGGCQ CAGGGGCAGC 240 

AAGGGCCTCG GGGCCGGGAG GAGGCGCCCC GCGGGGTCCG CATGGCX3GCG CGGCTTCAGG 300 

GCTGAATGGA TGCTGCa«3AT GCGGGGCCAG GGGGCCGGAG AGCOGCCTQC TTGAGTTCTA 360 

CCTCX3CCATG CCTTTCGCGA CACCCATGOA AGCAQAGCTG GCCCGCAGGA QCCTGGCCCA 420 

GGATOCCCCA CCGCTTCOOG TGCCAGGGGT GCTTCTGAAQ GAGTTCACTQ TOTOCGGCAA 480 

CATACTGACr ATCOQACTCA CTGCTGCftGA COCXXSCCM. CIGCAGCTCT CCRT CAOCTC 540 

CTQTCTCCAG CAGCTTTCCC TGTTGATGTG GATCACGCftG TGCrTTCrOC CCSTQTTTTT 600 

GGCTCSMSCCT CCCTCAGGGC AGAGGCXKTTA AGCOCAGOCT aaOBCCCCTr CCTAGGTCAT 660 

GCCTCCTCCC CTAGGGAATO GTCCCAGCAC GAGTQGCCAa TTCATTOTOO GGQCCTCATT 720 
QTTrGTCQCT GGAGGAGGAC GGCTTACATG ITTSTTTCTG TAGAAAATAA AACTGAGCTA 



11 I I 

MQAEGRGTGG STGDADGPGG PGIPDGPGGN AGGPGEAGAT GGRGPRGAGA ARASGPGGGA
PRGPHGGAAS GLNGCCRCGA RGPESRLLEF YLAMPFATPM EAELARRSLA QDAPPLPVPG
VLLKEFTVSG NILTIRLTAA DHRQLQLSIS SCLQQLSLLM WITQCFLPVF LAQPPSGQRR



1 11 21 31 41 51

| | | | | |

CCTCGTGGGC CCTGACCTTC TCTCTGAGAQ CCGGGCSUSAG GCTCCGGAGC CATGCAGGCC 
GAAGGCCAGG GCACAGGGGG TTCQACGGGC GATGCTGATO GCCCAGGAGG CCCTGGCATr 
CCIOATGGCC CAGGGGGCAA TGCTGGCGGC CCAGGAGAGG CGGGTGCCAC GGGCGGCAGA 

GGTCCCC3GGG GCGCa«3GGQC AGCAAGGGCC TCX3GGGCCGA GAGGAGGCGC CCCGCX3GGGT
CCGCATGGCO GTGCCGCTTC TGOGCAGGAT GGAAGGTGCC CCTGCGGGGC CAGGAGGCCX3 
GACSkGCOGCC TGCTTCASTT CCGACTGACT GCTGCAGACC ACOGCX»ACT GCAGCTCTCC 
ATCAGCTCCT GTCTCCACCA GCTTTCCXrrG TTGATGTGGA TCACGCAGTG CTTTCTGCCC 
GTGTTTTTGQ CTCAGGCTCC CTCAGGGCAG AGGOGCTAAG CCCAGCCTGG CGCCCCTTCC 

TAGGTCATGC CTCCTCCCCT AGGGAATGGT CCCAGCaiOQA GTGGCCAGTT CATTQTGGGQ
GCCTGATTGT TTGTCGCTGG AGGAGGACGG CTTACATQTT TGTTTCTGTA GAAAATAAAG 
CTSAGCTA 

1 11 21 31 41 51

| | | | | |

MQAEGQGTGG STGDADGPGG PGIPDGPGGN AGGPGEAGAT GGRGPRGAGA ARASGPRGGA
PRGPHGGAAS AQDGRCPCGA RRPOSRUAF RLTAADHRQI. QI.SISSCIAQ bSUiMNITQC

PLPVFLAQAP SGQRR 

Seq ID NO: 590 DNA sequence
Nucleic Acid Accession #: NM_005562.1
Coding sequence: 90..3671

1 11 21 31 41 51 

| | | | | |

ACAGCGGAGC GCAGAGTGAa AACCACCAAC OGAGaCaCCG GGCAGOQACC CCTaCAGCGG 

AOACAQAGAC TGAGCGGCCC GGCAOCGCCA TGCCTGCGCT CTOGCIGGGC TGCTGCCTCT
GCTTCTCGCT CCTCCTGCCC GCAGCCOQGQ CCACCTCCAG GAGGGAAGTC TGTOATT6CA 
ATGGGAAGTC CAGGCAGTGT ATCTTT6ATC GGGAACTTCA CAGACAAACT GGTAATGQAT 
TCCGCTGCCT CRACTGCAAT GACAACACTG ATGGCATTCA CTGOGAGAAG TGCAAGAATG 
GCTTTTACCQ GCACAGAGAA AGGGACCGCT GTTTGCCCTG CAATTGTAAC TCCAAAGGTT 

CTCTTAGTGC TCGATGTGAC AACTCTGGAC GGTGCAGCTQ TAAACCAGGT GTGACAGGAG
CCAGATGCGA CCGATGTCTG CCAGGCTTCC ACATGCTCAC GGATGOGGGG TGCACCCAAG 
ACCAOAGACT GCTAGACTCC AAGTGTGACT GTGACCCAGC TGGCATCGCA GGGCCCTGTG 
ACGGGGGCCa CTGTGTCTGC AAGCCAGCTG TTACTGOAflA AOGCTGTGAT AGGTGTCGAT 
CAGGTTACTA TAATCTGGAT GGaOGGAAGC CTOAaGSCTG TACCCAfilQT TTCTGCTATG 

GOCATTCAGC CAGCTGCCGC AGCTCTGCAQ AATACAGTGT CCATAAGMC ACCTCTACCT
TTCATCAAGA TGTTQATGGC TGGAAOGCTG TCCAACGAAA TCQOTCTOCT aCAAAGCTCC 
AATGGTCACA GCQCCATCAA GRTGTGTTTA GCTCAGCCCA ACGACTAOAC CCTQXCTATT
TTGTGGCTCC TGCCAAATTT CTTGGGAATC AACAGGTGAG CTATGGGCAA AGCCTGTCCT 900

TTGACTACOG TGTGGACAGA GGAGGCAGAC ACCX:ATCTGC CCATGATCTG ATTCTGGAAG 960 

GTGCTGGTCT AC3GGATCACA GCTCCCTTGA TGCCACTTGG CAAGACACTG CCTTGTGGGC 1020 

tcaccaagac ttacacattc aggttaaatg agcatccaag caataattgg agcccccagc loao 

TGAGTTACTT TGAGTATCGA AGGTTACTGC GGAATCTCAC AGCX:CTCCGC ATCCGAGCTA 1140 

CATATGGAGA ATACAGTACT GGGTACATTQ ACAATQTOAC CCTGATTTCA GCCCGCCCTG 1200 

TCTCTGGAC5C CCTAGCACCC TGGGTTGAAC AGTOTATATG TCCTGTTGCKS TACA ASGGG C 1260 

AATTCTGCCA GGATTGTGCT TCTGGCTACA ASAGAGATTC AGOQA6ACTG GGGCCTTTTO 1320 

GCACCTGTAT TCCTTGTAAC TGTCAAGGGG GAGGGGCCTG TGATCCAGAC ACAGGAGATT 1380 

GTTATTCAGG GGATGAGAAT CCTGACATTG AGTCTGCTGA CTQCCCAATT GGTTTCTACA 1440 

ACGATCCGCA CGACCCCCGC AGCTGCAAGC CATGTCCCTG TCATAACGGG TTCAGCTGCT 1500 

CAGTGATGCC GGAGACXSGAG GAGGTGGTGT GCAATAACTG CCCTCCCXK3Q GTCACCXX3TG 1560 

CCCGCTGTGA GCTCTGTGCT GATGGCTACT TTGGGGACCC CTTTGGTGAA CATGGCCCAG 1620 

TGAGGCCTTG TCAGCCCTGT CAATQCAACA ACAATGTGGA CCCCAGTGCC TCTGGGAATT 1680 

GTGACCGGCT GACAGGCAGG TGTTTGAAGT GTATCCACAA CACAGCCGGC ATCTACTGCG 1740 

ACCAGTGCAA AGCAGGCTAC TTCGGGGACC CATTOGCTCC CAACCCAQCA QACAABTQTC 1800 

GAGCTTGCAA CTCTAACCCC ATGGGCTCAO A6CCTOTAGQ ATGTOBAAGT GATGGOVCCT 1860 

GTGTTTGCAA GCCAGGATTT GGTGGCCCCA ACTGT6AOCA TGGAGCATTC AGCTQTCCAG 1920 

CTTGCTATAA TCAAGTGAAQ ATTCAQATGG ATCAGTTTAT GCAGCAGCTT CAGAGAATGO 1980 

AGGCCCTGAT TTCAAAGGCT CAGGGTGGTG ATGGAGTAGT ACCTGATACA GAGCTGGAAG 2040 

aCAGGATQCA GCAGGCTGAG CAGGCCCTTC AGGACATTCT GAGAGATGCC CAGATTTCAG 2100 

AAGGTGCTAQ CAGATCCCTT GGTCTOCAGT TQGOCAAGGT GAGGAGCCAA GAGAACAGCT 2160 

AOCAGAGCCG CCKSQATGAC CTCAAGATOA CTGTGCAAAG ASTTCGGGCT CTGGGAAGTC 2220 

AGTACCAGAA COOAGTTCQa GATACTCACA GGCTCATCAC TCAGATGCAG CTCAGOCTGG 2280 

CAGAAAGTGA AGCTTCCrTG GGAAACACTA ACATTCCTGC CTCAGACCftC TACGTGGGGC 2340 

CAAATGGCTT TAAAAGTCTG GCTCAGGAGG CCACAAGATT AGCSiQAAAGC CACGTTGAGT 2400 

CAGCCAGTAA CATGOAGCAA CTGAC3UKK3G AAACTGAGGA CTATTCC3UUI CAAGCCCTCT 2460 

CACTGGTGCG CAAGGCCCTG CATGAAGGAG TCGGAAGOGG AAGOMTAGC CCGGACXSGTCi 2520 

CTOTGGTGCA AGGGCTTGTG GAAAAATTGG AGAAAACCAA 6TCCCTGGCC CAGCAGTTGA 2580 

CAAGGGAGGC CACTCAAGCG GAAATTGAAG CAGATAGGTC TTATCAGCAC AGTCTCCGCC 2640 

TCCTGGATTC AGTCTCTCGG CTTCAGGGAG TCAGTGATCA GTCCTTTCAG GTOOAAGAAG 2700 

CAAAGAOQAT CAAACAAAAA GCGGATTCAC TCTCAAOSCT GGTAACCAGG CATATGGATG 2760 

AGTTCAAGOG TACACAAAAO AATCTQGGAA ACTGOAAAOA AGAAGCACAG CAGCTCTTAC 2820 

AGAATGOAAA AAGIGGQAGA OAlGAAATCAG ATCAGCTQCT TTCCOGTGCC AATCTTGCTA 2880 

AAAQCAGAGC ACAAGAAGCA CTGAGTATGG GCAATGCCAC TTTTTATGRA GTTGAGAGCA 2940 

TCCTTAAAAA CCTCAGAGAG TTTGACCTGC AGGTGGACAA CAGAAAAGCA GAAGCTGAAG 3000 

AAGCCATGAA GAGACTCTCC TACATCAGCC AGAAGGTTTC AGATGCCAGT GACAAGACCC 3060 

AGCAAGCAGA AAGAGCCCTG GGGAGCGCTG CTGCTGATGC ACAGAGGGCA AAGAATGGGG 3120 

CCX3GGGAGGC CCTGGAAATC TCCAGTGAGA TTGAACAGGA GATTGGGAGT CTGAACTTGG 3180 

AAGCCAATGT GACAGCAGAT GGAGCCTTGG CCATGGAAAA GGGACT GGCC TCTCTGAAGA 3240 

GTGAGATGAG GGAAGTGGAA GGAGAGCTGG AAAGGAAGGA GCTGGAGTTT GACACOAATA 3300 

TGQATGCAGT ACAGATGGTG ATTACAGAAG CKCAGAAGGT TOATACCAGA GCCAASAACG 3360 

CTGGGGTTAC AATCCAAOAC ACACTCAACA CATTAGAOGQ CCTCCTGCAT CTGATGGACC 3420 

AGCCrCTCAG TGIAGATGAA GAGGtoCTOG TCTTACIGGA GCAGAAGCTT TCCXSAGCCav 3480 

AGACCCAQAT CAACAGCCAA CTGCGGCCCA TGATGTCAiSA GCIGGAAGAG AGGGCACOTC 3540 

AGC3VGAGGGG CCACCTCCAT TTGCTGGAGA CAAGCATAGA TGGGATTCTQ GCTQATGTOA 3600 

AGAACTTGGA GAACATTAGG GACAACOTOC CCCCaVGQCTG CTACAATACC CAGGCTCTTG 3660 

AGCAACAGTG AAGCTGCCAT AAATATTTCT CAACTGAGGT TCTTGGGATA CAGATCTCAG 3720 

GGCTCGGGAG CCATOTCATG TGAGTGGGTG GGATGGGGAC ATTTGAACAT GTTTAATGGG 3780 

TATQCTCAGG TCAACTGACC TGACCCCATT CCTGATCXX3^ TGGCCAGGTG GTTGTCTTAT 3840 

TGCACCATAC TCCTTGCTTC CTGATGCTGG GCAATGAGGC AOATAGCACT GGGTGTGAGA 3 900 

ATGATCAAGG ATCTQGACCC CAAAQAATAG ACTGGATGGA AAGACAAACT GCACAGGCAG 3960 

ATGTTTGCCT CXTAATAGTC GTAAGTGGAG TCCTGGAATT TGGACAAGTG CTGTTGGGAT 4020 

ATAOTCAACT TATTCTTTGA QTAATGTGAC TAAAGGAAAA AACTTTGACT TTGCCCAGGC 4080 

ATGAAATTCT TCCTAATGTC AGAACAGAGT GCAACCCAGT CACACTGTGG CCAGTAAAAT 4140 

ACTATTGCCr CMATTGTCC TCTGCAAGCT TCTTGCTGAT CAOAGTTCCT CCTACTTACA 4200 

ACXICAGGGTG TGAACATGTT CTCCATTTTC AAGCTG GAAG AAQTOAGCAG TGTTGGAGTG 4260 

AGGACCTGTA AGGCAGGCCC ATTCAGAGCT ATGGTGCTTO CTGGTGCCTG CCACCTTCAA 4320 

GTTCTGGACC TGGGCATGAC ATCCTTTCTT TTAATGAtGC CATGGCAACT TAGAGATTGC 4380 

ATTTTTATTA AAGCATTTCC TACCAGCAAA GCAAATCTTG GGAAAGTATT TACTTTTTCG 4440 

GTTTCAAAGT GATAGAAAAG TGTGGCTTGG GCATTGAAAG AGGTAAAATT CTCTAGATTT 4500 

ATTAGTCCTA ATTCAATCCT ACTTTTCGAA CACX»AAAAT GATGCX3CATC AATGrATTTT 4560 

ATCTTATTTT CTCAATCTCC TCTCTCTTTC CTCCACCCAT AATAAGAGAA TGTTCCrACT 4620 

CACACTTCAG CTGGGTCaCA TCCATCCCTC CATTCaTtXTT TCCATCCATC TTTCCATCCA 4680 

TTACCTCCAT CXyVTCCTTCC AACATATATT TATTGAGTAC CTACTGTGIG CCAGGGGCI6 4740 

GTGGGACAGT GGTGACATAG TCTCTGCCCT CATAOAGTTG ATTQTCTAOT GAGGAAGACA 4800 

AGCATTTTTA AAAAATAAAT TTAAACTTAC AAACTTTGTT TGTCACAAGT GGTQTTTATT 4860 

GCAATAACCG CTTQOTTTGC AACCTCTTTG CTCAACAGAA C3VTAT«STTOC AASAC CCTCC 4920 

CATGGGGGCA CTTGAGTTTT GGCAAGGCTG ACAGAGCTCT G6QTTGT6CA CATTTCTTTG 4980 

CATTCCAGCT GTCACTCTGT GCCTTTCTAC AACT6ATT6C AACAGACTGT TGAGTTATGA S040 

TAACACCAGT GGGAATTGCT GGAGGAACCA GAGGCACTTC CAOCTTGGCT GGGAAGACTA SlOO 

TGGTOCTGCX: TTGCTTCTOT AriTCC TT O O ATTTTOCTGA AAflTGTTTTT AAATAAAGAA 5160 
CAATrGTTAO ATGCC 



1 11 21 31 41 51 

| | | | | |

MPAWJLGCCii CFSr.I.LPAAR ATSHREVCDC NGKSRQCIFD RELHRQTSIG FRCLMC21DHT 
DGIHCEKCKN GFYRHRERDR CIiPCNCNSKO SLSARCDNSG RCSCKPGVTG AROJRCLPGP 
HNLTDAGCTQ DQRI«IJ>SKCD CDPAGIAGFC DAGRCVtaiPA VTGERCDRCai SGYYHLDGffll 
PBSCTQCFCY GHSASCRSSA EYSVHKITST FIK20VDQWKA VQRNaSPAKL QWSQBHQDVP 
SSAQHMIPVY FVAPAKFIiCW QQV8YGQSI.S FDTOVDKGGR HPSAKDVILE GAGLRITAPL 
MPI<GKn.PCG LTKTYTElUiM EHPSIOnfSPQ LSYFEYRRIJi RNIiTAUlIRA TfGBnfSTGVI 
DNVTLISARP VSGAPAPHVE QCICPVCYKG QFOODCASGY KROSARLGPF GTCIPOIGQG 
GGACDPDTGD CYSGDEHPDI ECADCPIGFY NOraDPRSCK PCPCUKQFSC SWMPETEEW
CNNCPPGVTG AHCELCADGY
2 DQCKAGYFGD 
? ACYNQVKIQM 
QDIIiROAQIS EXlASRSIiGLQ 
RIiITQMQLSL AESEASW3NT 
ETEDYSXQAL SLVRKALKEQ 
ADRSYQHSI.R IXDSVSRLQG 
NWKEEAQQLL QNGKSGREKS 
QVOmiKAEAB BAMKRXiSYIS 
IEQBIGSZ<NL EANVTADGAL 
AQKVDTRAIQI AGVTIQOTUI 
MMSELBERAR QQRGKUIliLB 



Seq ID NO: 592 DNA sequence

Nucleic Acid Accession #: AF101051.1

Coding sequence: 221..856






VSDQSFQVEE 
DQIiLSRANLA. 
QKVSDRSDKT 



VRPCQPCQCN 
RACNCMPMGS 
EALISKAQGG 
YQSRLDDIiKM 
FNGFKSLAQB 
AWQQLVEKL 
AKRIKQKADS 
KSRAQEALSM 



NNVDPSASGN 
EPVGCRSDGT 
DGWPDTELE 



CDRLTGRCLK 
CVCKPGFGGP 
GRMQQAEQAI. 
QYQNRVRDTH 
SASNMEQbTR 
TREATQAEIE 
EFKRTQKtlliQ 



t. PPGCYNTQAI. BQQ 



ACCTGCCACC 
GCTGTTGGGC 

acx:ccAQTGa 

CGAGGGGCTG 
TGACTCCTTG 
CATCCTCCTG 
CTTGGAAGAC 



ATTCTATGAC 



GAAnSACTAC 
GGACATTGAO 
GTATGGTATT 
AAACRTGGCT 
TTGTATTACT 
TATATATAGA 
CTCATTATGT 
CCATATTGAT 
CAGTCAAATA 
CTAATTTACC 
TTATTTTTTA 
TTTCATTGGT 
AGCCAAGAAG 
GTGATAAATT 
TTTGCTTTCA 
CaCAACTTTA 
ACCTTTTTCjT 
TATATCTTCC 
GATAATCTGG 



AATATTAATT 
TTTATTTQCT 
CTTCATGTGA 
ACSVCATACCT 
AAACCTACGC 
ATTCTTTCAG 
TTTCCAGTCT 



GCCTTAACCA 
AAGATTCTGA 
ACAGATGTAA 
TTTTGAATCA 
TCTTAGCTGO 
CTACACAAGG 
TOCCTTCCAA 
ATACATAGAT 
ACAAAAAAAT 
TTTGATCTTT 



11 
I 

AGCTTCTAGT 
CTTCTCCAGC 
GCCACCTTCO 
CCTGAGCCAQ 
TTCATTCTCG 
AGGATTTACT 
TGGATGTCCT 
CTGAATCTGA 
GOAGTGATAG 
GATGAGGTGC 
CTGGCTATTT 
CCTATGACCC 
GCTOCTTCTC 
ACCTCTTACC 
OTGTOACACA 
ATACTATCAT 
ACAAAAC3U\A 
TAATCTTATT 
GCTTCCCATT 
TATGTATATA 
TGATACTAGC 
GAABATGTTT 
TCATTTACTC 
AAGOATQAAT 
CCATAATCTT 
CTCTATCTCC 
AATTTATTAC 
CCTGTTGACC 
AAATATTTGT 
TTGATTCAAT 
TCCCCATTCC 
TAATAAGQTG 
IGACAAATAT 
ATCTGCCftAA 
AGTTTATATT 
CAGCTGGCTG 



ATCCAGACTC CAGCX3CX:GCC COSGGCGOGG 
CGAGCAGGGC TCCCCGCCTT 
TTGCCCACCT GCAAACTCTC 
CGAQOIAQTC ATGGCCAACG 
ATGOATOGGC GCCATCGTCA 
CCTATGCCGG CGACAACATC GTGACCGCCC 
GCGTGTCGCA GAGCACCGGO CAGATCCAGT 



: AGCATGGTAT 
2 CAGGTACGAA 
r GGGAGGTGCC 



G6CATGAAGT 
ATTGGGGOTG 
GGCAATA6AA 
TTTGGTCAGG 
CTACTTTGCT 



51 
1 

ACCCCAACCC 
AACTTCCTCC 
CGCCTTCTGC 
CX3GGGCTGC3V 
GC3VCTOCCCT 
AGGCCATGTA 
GCAAAOTCTT 
TGGTGGTTGG 
GTATGAAGTG 
CGATATTTCT 



TCGTTCAAGA 660 



TAACATTAGG 
CAAACAAACA 
TTATCTTCTT 
GAGTAATCAT 
TACATQTTTT 
ATACTTAAAA 
ATTGGTATAT 
TTCTTCATTA 



OAGAAAATCA 
ACCTTAGAAT 
AAAAACCCAT 
TCCTCAATAT 
ACTCAAATGG 
TCTATTAAAA 
TATCTCTAAA 



TOTTQAAACA AACCGAAAAT 900 



GTGTTAAAAT 



GCTTTGGGTQ 



ATAGACAGTA 
ATAGGTAAAT 
GTCCTTATAT 
CCTTTGCCAC 



GTAATCTGAA 
ACTCAGTGCT 
ATTTTACCAT 
GCTCCTT 



ATAGCACTTG 
TGAATCTAAC 
AAATCAGAAC 
TTCCCACACA 
CCAATTGAGT 
TTTTAAGCTA 
TTAATTGTAT 

TCTCTCTQTA 



ACTCTCATTC 



AAAAGGAAAA 



TCATGTGGTT 
ACATACCTTC 
CTGTGTCTGA 
GTACAGAATG 
CTGGAGACCT 
TTGGCTGCTG 
CACCTCACAG 
AATTTGAAAA 
TTGCTTTTCA 
GTCTCTCAAG 
GGAAGTCTTA 
TGGGAAGAAA 
TAATAACTCA 
CaWXTTGACGC 
AAAGTCAOCC 
ACXTTGAGAAT 
CTTCATGATG 
TTTATGGCCC 
TTATATTCTT 
AATTTGTATA 
AAAAAAAAAA 



TCCTCTCTCT 
CAGTGCCTTC 
ATGTGGCTCA 
CATGTTTGTG 
CTATTTCACT 
GGATTTGAGT 
TAAGCTTATT 
TGATGTTGTG 
GTGCTATACT 
AATGTTTGAA 
TGATGAQACA 
TCTTCTGCAG 
TAAAAGCCTA 
TAAGGTGCTA 
TGCTAGGATA 
ACCX5TCTCTT 
ATATGCTTTT 
TGTGAGTGTA 
AAAATGACCA 
CTACCACACC 
AAGCATTACT 
AAA 



CATCGTTATT 
ACATTTCATA 
TTTGGAGGCA 
ATCCCTGTAC 
AGCTGCATGC 
CTTATTCATA 
TGTTTTCCCA 
GTCTGAACAA 
GCTGTAAGCA 
GATAiCTTAAC 
TTTGAACATG 
GAAGTCACTG 
ACX3VGTCTAT 



AAOCCCTTAT 
GCCTACATTT 
AATCTTTCTG 
TCTGACCC3VT 
TGTTCCCCCA 
GTTTTATATC 
AGTGTAATTA 
AGTGCTAOAC 
AGTCavCTTAA 
CAGTTAGAAG 



AAATACTATT 
GTATTTAATT 
ACATATGTAA 
AAGACCTAGC 
TATACTTATT 



CATGACCAAA 
AGCACTCTTG 
GGTGTTGTAA 
CCCCTAAACT 
TCATGCX5TTT 
TTTCTGOAGT 
TCTTTCTACC 
AGGTAGTGTG 



1020 
1080 
1140 
1200 

1320 
13B0 
1440 
1500 
1560 

1680 
1740 
1800 
1860 
1920 



GTGCCTTCCT 
CTCTGTTCXA 
TGAGCftAGAT 



GCTTCATCTG 



AACAAAACCT 
TTCCACTGAA 
CAGTCTATTT 
CTCTCTACCA 
TTTTAACAAC 
GATGTATGGA 
TCRATCACCG 
TAAGCGGTGG 
GAGATAGAAT 
ATTGAGGAAT 
T6TTAAGAAA 



ACACACGTAC 



CCACTGAACA 
GTCTATTTCC 
TGCTCTTACT 
AAGGGTGTTG 
TCTGTGTTTG 
TTTGTAATTC 
ACATGTAAGT 
TAACTGCATA 
TGGGTTTCTT 



OQTGTTGGTA 
TCTGTTCAGT 
GTTAGTTTGG 
ATCAGGAATT- 
GGAAOTTAAA 
ATTCCATGTG 
A0GAAATT6T 



AATCCAACAG 
GATGCCCTCA 
AAATGGTACT 
GGACCTAATA 
ATTTAAATGG 
GATATCAGTT 
TAOVATAOAA 
CCAATAGACA 



GAGCTCTTGC 
TCATAATAAA 
AATTTTAGTG 
CTTTTGCCAC 
ACCftAACATT 
TTTATOCAAT 
TGGGG 



CTTTTTCAAT AAATTCTTTT TTAATTTAAA 



2220 
2280 
2340 
2400 
2460 
2520 
2580 



3880 
2940 
3000 



3300 
3360 
3420 



1 11 21 31 41 51

MANAGLQLLG FILAFLGWIG AIVSTALPQW RIYSYAGDNI VTAQAMYEGL WMSCVSQSTG
QIQCKVFDSL LNLSSTLQAT RALMVVGILL GVIAIFVATV GMKCMKCLED DEVQKMRMAV
IGGAIFLLAG LAILVATAWY GNRIVQEFYD PMTPVNARYE FGQALFTGWA AASLCLLGGA
LLCCSCPRKT TSYPTPRPYP KPAPSSGKDY V
Seq ID NO: 594 DNA sequence

Nucleic Acid Accession #: NM_006180.1

Coding sequence: 352..2820

1 11 21 31 41 51 

CCCCCATTCQ CATCTAACAA GGAATCTGCG CCCXaVOAGAG TCCXX3GAC3GC CGCOGGTOQG 60 

TGCCCGGCGC GCCGGQCXa^T GCAaCGAOSQ CCX3CCGCGGA GCTCCGAGCR GCGGTAG03C 120 

CCCCCTGTAA AGOGGTTCGC TATGCCXXX3A CCACTGTGAA CX:CTGCCX3CC TGCGGQAACa. 180 

CTCTTOGCTC OGGACCAGCT CAGCCTCTGA TAAGCTGGAC TCGGCACGCC CGCAACAAGC 240 

ACOSAGGAGT TAAGAGAGCC GCAAGCX3CAG GGAAQGCCTC CCCGCACGGG TGGGGGAAAG 300 

CGGCOGQTGC A60G0GGGGA CAGGCACTCG GGCTGGCACT GGCTGCTAGG GATGTCGTCC 360 

TGOATAAGOT GGCATGGACC CGCCATGGCG CGGCTCTGGG GCTTCTGCTG GCTGGTTGTG 420 

GGCTTCTGOA GGGCCX3CTTT CGCCTGTCCC ACGTCCTGCA AATGCAGTQC CTCTOGGATC 480 

TGCTGCAGCG ACCXTTTCTCX: TGGCATCXJTG GCATTTCOGA GATTGGAGCC TAACAQTGTA 540 

GATCCTCAGA ACATCaiCCGA AATTTTCAXC GCAAACCAGA AAAOGTTAGA AATCATCAAC 600 

GAAGATGATG TTGAAGCTTA TGTGGGACTG AQAAATCTGA CAATTGTGGA TTCTGGATTA 660 

AAATTTGTGG CTCATAAAGC ATTTCTGAAA AACAGCAACC TGCAGCAC3VT CAATTTTACC 720 

CX3AAACAAAC TaACGAGTTT GTCTAGGAAA CATTTCCGTC ACCTTGACTT GTCTGAACTG 780 

ATCCTG6TOG GCAATCCATT TACATQCTCC TGTOACATTA TGTGOATCAA GACTCTCCAA 840 

GAGGCTAAAT (XAGTCCAGA CACTCAGGAT TTGTACTGCC TGAATGAAAG CAGCAAGAAT 900 

ATTCCCCTOG CAAACCTGCA GATACXCAAT TGTGGnTGC CATCTGCAAA TCTGGCCGCA 9fi0 

CCTAACCTCA CTGTGGAGGA AGGAAAGTCT ATCSkCATTAT CCTGTAQTGT 6GCAGGTGAT 1020 

CCGGTTCCTA ATATGTATTG GGATGTTGGT AACXTr GG fTT CCAAACATAT GAATGAAACA 1080 

AGCCACACAC AGGGCTCCTT AAGGATAACT AACATTTC»T CCGATGACAG TQQGAAGCAG 1140 

ATCTCTTGTG TGGCGOAAAA TCTTGTAGGA GAAGATCS^G ATTCTGTCSUV CXrrCACTGTQ 1200 

CATTTTGCAC CAACTATCAC ATTTCTCX3AA TCTCCAACCT CAGACCACCA CTOGTGCATT 1260 

CCATTCACTG TGAAAGGCAA CCCCAAACCA GCGCTTCAGT GGTTCTATAA CGGGGCAATA 1320 

TTGAATGAGT CCAAATACAT CTGTACTAAA ATAC3VTGTTA CCAATCACAC GGAGTACCAC 1380 

QGCTGCCTCC AGCTOGATAA TCCCACTCAC ATGAACAATG GGGACTACAC TCTAATAGCC 1440 

AAGAATGAGT ATGGGAAGGA TGAGAAACAG ATTTCTQCTC ACTTCATGGG CTGGCCTGGA 1500 

ATTCACGATG GTGCAAACCC AAATTATCCT GATGTAATTT ATGAAGATTA TGGAACTQCA 1560 

GCGAATGACA TCSGGQACAC CACGAACAGA AQTAATGAAA TCCCTTCC31C AOACOTCACT 1620 

GATAAAACCG GTCGGQAACA TCTCTCOQTC TATeCTQTQG TCGTQATTGC OTCTGTGGTG 1680 

GGATTTTGCC TTTTGOTAAT GCT G TTTCTG CTTAAGTTOG CAAGACACTC CAAOTTTQQC 1740 

AT6AAAGGCC CAGCCTCCGT TATCAGCAAT GATGATGACT CTGCX31GCCC ACTCC3VTCAC 1800 

ATCTCCAATG GGAGTAACAC TCCATCTTCT TOMAAQGTG GCCCAGATGC TGTCATTATT 18S0 

GGAATGACCA AGATCCCTGT CATTGAAAAT CCCCAGTACT TTGGCATCAC CAACAGTCAG 1920 

CTCAAQCCAO ACACATTTGT TCAGCACATC AAGCGACATA ACATTGTTCT GAAAAGGGAG 1980 

CTAGGCGAAO GAGCCTTrOO AAAAGTGTTC CTAGCTQAAT QCTATAACCT CTGTCCTGAG 2040 

CAGQACAAGA TCTTGOrGGC A6TGAAGACC CTGAAGGATG CCAGTGACAA TGCACGCAAG 2100 

GACTTCCACC GTOACSQCCQA GCICCTGACC AACCTCCAGC ATQAGCACAT OOTCAABTTC 2160 

TATGQCSTCT GCGTQGAGGG OGACCCCCTC ATCATGQTCT TTGAOTACAT GftAflCATOGG 2220 

GACCTCiAACA AGTTCCTCAG GGCACaVOGGC CCTGATGCCG TGCTGATGGC TGAGGGCAAC 2280 

CX3X:CCACGG AACTGACGCA GTC3GCAGATQ CTGCATATAG CCXSlGCftGAT CXSCXGCGGGC 2340 

ATGGTCTACX: TGGOGTCCXIA GCACTTOGTG CACCGCGATT TGGCCACCAG GAACTGCCTG 2400 

GTOSGGGAGA ACTTGCTGGT GAAAATCGGG GACTTTGGGA TGTCCTGGGA CGTGTACAGC 2460 

ACTGACTACT ACAGGGTCX3G TGGCCACACA ATGCTGCCCA TTCGCTGGAT 2520 

AGCATCATGT ACAGGAAATT CACGACGGAA AGCGACGTCT GGAGCCTGGG GGTCGTGTTG 2580 

TGGOAaATTT TCACCTATOa CAAACAGCCC TGGTACCAGC TGTCAAACAA TX3AGGTGATA 2640 

GA6TGTATCA CTCAGGGCOQ ASTCCTGCAG CGACCCCQCA OSTGCCCCCA GGAGGTGTAT 2700 

GAGCTOATGC TOaGGTGCTG GCASCGAGAG CCCCACATGA GGAAGAACAT CMGOaCATC 2760 

CATACCCTCC TTCAGAACTT GGCCAAGGCA TCTCCGGTCT ACCTGOACAT TCTAGGCTAG 2820 

GGCCCTTTTC CCCAGACCGA TCXTTTCCCAA CGTACTCCTC AGACGGGCTG AGAGQATGAA 2880 

CATCTTTTAA CTGCCGCTGG AGGCCACCAA GCIGCTCTCC TTCACTCTGA CAGTATTAAC 2940 

ATCAAAGACT CCGAGAAGCT CTCGAGGGAA GCAGTGTGTA CTTCTTCATC CATAGACACA 3000 

GTATTGACTT CTTTTTGGCA TTATCTCTTT CTCTCTTTCC ATCTCCCTTG GTTGTTCCTT 3060 

r r rc r rrrn' taaattttct ■ rri ' i 'L'r i 'ciT ttttttcgtc ttccctgctt cacgattctt 3120 

ACCXrrrrCTT TTGAATCAAT CTGGCTTCTG CATTACTATT AACTCTGCAT AGACAAAGGC 31B0 

CTTAAC3UAC GTAATTTGrT ATATCAGCAG ACACTCCAGT TTGCCCAOSV CAACTAACAA 3240 

TGCCTTGTTa TAXTCCTGCC TTTGATGTGG ATGAAAAAAA GGGAAAACAA ATATTTCACT 3300 

TAAACTTTGT CACTTCTGCT QTACAGATAT CGAGAGTTTC TATGGATTCA CXTCTATTTA 3360 

TTTATTATTA TTACTGTTCT TATTGTTTTT GGATGGCTTA AGCCTOTGTA TAAAAAAGAA 3420 

AACTTGTGTT CAATCTGTGA AGCCTTTATC TATCGGAGAT TAAAACCAGA GAGAAAGAAG 3480 

ATTTATTATG AACCGCSkATA TGGGAGGAAC AAAGACAACC ACTGGGATCA OCTG STGTCA 3540 

GTCXXTTACTT AGGAAATACT CAGCAACTCT TAGCTGGGAA GAATGTATTC GGCAC CTTCC 3600 

CCTQAGGACC TTTCTGAGGA CTAAAAAGAC TACTGGCCTC TOTGCCATGG ATGATTCTTT 3660 
TCCCaVTCACC ASAAATQATA GCQTGCAGTA GAGAOCAAAG ATGQCTT 



Protein Accession #: NP_006171.1

1 11 21 31 41 51 

MSSWIRWHGP AMARLWGFCW LVVGFWRAAF ACPTSCKCSA SRIWCSDPSP GIVAFPRLEP
NSVDPENITE IFIANQKRLE IINEDDVEAY VGLRNLTIVD SGLKFVAHKA FLKNSNLQHI
NFTRNKLTSL SRKHFRHLDL SELILVGNPF TCSCDIMWIK TLQEAKSSPD TQDLYCLNES
SKNIPLANLQ IPNCGLPSAN LAAPNLTVEE GKSITLSCSV AGDPVPNMYW DVGNLVSKHM
NETSHTQGSL RITNISSDDS GKQISCVAEN LVGEDQDSVN LTVHFAPTIT FLESPTSDHH
WCIPFTVKGN PKPALQWFYN GAILNESKYI CTKIHVTNHT EYHGCLQLDN PTHMNNGDYT
LIAKNEYGKD EKQISAHFMG WPGIDDGANP NYPDVIYEDY GTAANDIGDT TNRSNEIPST
DVTDKTGREH LSVYAVVVIA SVVGFCLLVM LFLLKLARHS KFGMKGPASV ISNDDDSASP
LHHISNGSNT PSSSEGGPDA VIIGMTKIPV IENPQYFGIT NSQLKPDTFV QHIKRHNIVL
KRELGEGAFG KVFLAECYNL CPEQDKILVA VKTLKDASDN ARKDFHREAE LLTNLQHEHI
VKFYGVCVEG DPLIMVFEYM KHGDLNKFLR AHGPDAVLMA EGNPPTELTQ SQMLHIAQQI
AAGMVYLASQ HFVHRDLATR NCLVGENLLV KIGDFGMSRD VYSTDYYRVG GHTMLPIRWM




PPESIMYRKF TTESDVWSLG VVLWEIFTYG KQPWYQLSNN EVIECITQGR VLQRPRTCPQ
EVYELMLGCW QREPHMRKNI KGIHTLLQNL AKASPVYLDI LG

Seq ID NO: 596 DNA sequence
Nucleic Acid Accession #: AF410899
Coding sequence: 483..2999



r OGCGCTCTAC 



TGCATAOCGO ACCCCCATTC GCATCTAACA 



AGOGGTAGCG 
CTGCCGGAAC 
CCGCAACAAG 
GT(3GGGGAAA 
GGATGTCGTC 
GGCTGGTTGT 
CCTCTOGOAT 
CTAACAGTGT 
AAATCATCAA 
ATTCTGQATT 
TCAATTTTAC 
TGTCTGAACT 
ASACTCTCCA 

ATCTGGCCGC 
TOOCAGOTOA 
TGAAT6AAAC 



ACCTCACTGT 
ACTGGTGCAT 
ACX3GGGCAAT 
CGGAGTACCA 
CTCTAATAGC 
GCTGGCCTGQ 
AT6GAACTGC 
CAOAOSTCAC 
CQT CTG TGQT 
CCAAGTTTGG 
GTGTTGGCCC 
TCTCCAATGG 
GAATGACCAA 



AGGACAAGAT 
ACTTCCACCG 
ATGGCGTCTO 
ACCTCAACAA 
OOCCCAOGGA 
TGGTCTACCT 



cccccxrrGTA 

ACTCTTCGCT 
CACXMAGGAG 
GCGGCCGGTG 
CTGGATAAGG 
GGGCTTCTGG 
CTGGTGCAGC 
AQATCCIGAe 
CGAAGATQAT 
AAAATTTGTG 
CCGAAACAAA 
GATCCTGGTG 
AGAGGCTAAA 
TATTCCCCTG 
ACCTAACCTC 



AAOCCACACA 
GATCTCTTGT 
GCATTTTGCA 
TCCATTCACT 
ATTGAATGAG 

CAAGAATGAG 
AATTGACGAT 
AGCGAATGAC 
TGATAAAACC 
GGGATTTTGC 
CATGAAAGAT 
AGCCTCCGTT 
GAGTAACACT 
GATCCCTGTC 
CACATTTGTT 
AGCCrrTGGA 
CTTGGTGGCA 
TGAGGCCGAG 
CGTGGAGGGC 
TCCTCAGQ 



AAGCGGTTCG 
CCOGACCAGC 
TTAAGAGAGC 



AGGAATCTGC 
TGCAGOOACQ 
CTATGCCGGG 
TCAGCCTCTG 



CCGCTCAGTC 
GCCGGAGCCC 
GCXXX»GAGA 



TGGCATGGAC 
AGGGCCGCTT 
GACCCTTCTC 
AACATCACOS 
GTTOAAGCTT 
GCTCATAAAG 
CTGACGAOTT 
GGCAATCCAT 



GCAAACCTGC 



ACTGACGCAG 



CTGACTACTA 
GC3VTCATGTA 
GGGAGATTTT 
AGTGTATCAC 
AGCTGATGCT 
ATA CCCT CCT 
GCCCTTTTCC 



CTTGCTQGTQ 
CAGGGTCGGT 
CAGGAAATTC 
CACCTATGGC 
TCAGGGCCX3A 



AATATOTATT 
CAGG6CTCCT 
GTOGCX36AAA 
CCAACTATCA 
GTGAAAGGCA 
TCCAAATACA 
CA6CTGGATA 
TATGGGAAGG 
GGTGCAAACC 
ATCGGGGACA 
GGTCGGGAAC 
CTTTTGGTAA 
TTCTCATGQT 
ATCAGCAATO 
CCATCTTCTT 
ATTGAAAATC 
CAGCACATCA 
AAAGTOTTCC 
GTGAAGACCC 
CTCXTTGACCA 
GACCCCCTCA 
GCACACXK3CC 
TOGCAQAIQC 
CACTTCGIGC 
AAAATOGGQO 
GGCCACACAA 
ACGACGGAAA 
AAACAGCXXrr 
GTCCTGCAGC 



ACCACTGTGA 
ATAAGCTGGA 
GGQAAGGCCT 
GGGCTGGCAC 
GaKCTCTGG 
CACX3TCCTGC 
GGCATTTCCG 
CGCAAACCAG 
GAGAAATCTG 
AAACauSCAAC 
ACATTTCCGT 
CTGTGACATT 
TXTGTACTGC 
TTGTGGTTTQ 
TATCACATTA 
TAACXJTGGTT 
TAACATTTCA 
AGAAGATCAA 
ATCTCXAACC 
AGCGCTTCAG 
AATACATGTT 
CATGAACAAT 
GATTTCTGCT 
TGATGTAATT 
AAGTAATGAA 
CTATGCTGTG 
GCTTAAGTTG 
TTOGATTTGO OAAAQTAAAA 



ACAGGCACTC 
CCGCCATGGC 
TCGCCTGTCC 
CTGGCATCGT 
AAATTTTCAT 
ATGTGGGACT 
CATTTCTGAA 
TGTCTAGGAA 
TTACATGCTC 
ACACTCAGGA 
AGATACCCAA 
AAGGAAAGTC 
GGGATGTTGG 
TAAGGATAAC 
ATCTTGTAGG 
CATTTCTCGA 
ACCCCAAACC 



SI 
I 

CCCGGCGGTA 
GGCCTCGAGG 
GTCCCGGACG 
AGCTCCX3AGC 
ACCCTGCCSC 
CTCX3GCACGC 
CCCCGCACGG 
TGGCTGCTAG 
GGCTTCTGCT 
AAATGCAGTG 
AGATTGGAGC 
AAAAGGTTAG 
ACAATTGTGG 
CTGCAGCACA 
CACCTTGACT 
ATGTGGATCA 
CTGAATGAAA 
CCATCTGCAA 
TCXrrGTAGTG 
TCCAAACATA 



ATCCCACTCA 
ATGAGAAACA 
CAAATTATCC 
CCACGAACAG 
ATCTCTCGGT 



OGGAAGGTGG 



CCCAOATGCT 
TGQCATCACC 
CATTQTTCTG 
CTATAACXrrC 
CAGTQACAAT 
TGAGCACATC 
TGAGTACATG 
GCTGATGGCT 
CCAGCAOATC 



GATTCTGTCSV 
TCAGACCACC 
TGGTTCTATA 
ACCAATCACA 
GGGGACTACA 
CACTTCATGO 
TATGAAGATT 
ATCCCTTCCA 
GTGGTGATTG 
GCAAGACACT 
TCAAGACAAO 
CTCCATCACA 
OTCATTATTO 
AACAGTCAGC 



TOTCCTGAGC 



GTCAAGTTCT 
AAGCATGGGG 
GAGGGCAACC 



ACrnGGQAT 
TGCTGCCCAT 
GCXSACGTCTG 
GGTACC3«3CT 
GACCCOSCAC 
CCCACATGAG 
CTCOGGTCTA 
GTACTCCTCA 



3 AACTGCCTCG 



GTCAAACAAT 
GTGCXCCCAG 
GAAGAACATC 
CCTGGACATT 



TCAAAGACTC 
TATTGACTTC 
TTCTTTTTTT 
CCCTTTCTTT 
TTAACAAACG 
GCCTTGTTGT 
AAACTTTGTC 



ACTTGTGTTC 
TTTArTATGA 
TCCCTACTTA 
CTGAGGACCT 
CCCATCACCA 
ATGGCXSCATA 



TTTTTGGCAT 
AAATTTTCTT 
TGMTCAATC 
TAATTTGTTA 
ATTCCTGCCT 
ACTTCTGCTG 
TACTGTTCTT 
AATCIG-TOAA 
ACOGCAATAT 
GGAAATACTC 
TTCTCAGGAG 
GAAATGATAG 
GTGTGCTCGG 



Taap GQAAG 
TATCTCTTTC 
TTTCTTCTTT 
TGGCrrCTGC 
TATCAGCAGA 
TTGATGTGGA 
TACAGATATC 



TTCT TCATC C 
TCTOCCTTQG 
TCCCTGCTTC 



GCCTTTATCT 
GOGAQGAACA 
AGCAACTGTT 



TCTCTTTCCA 
TTTTTCOTCT 
ATTACTATTA 
CACTCCAGTT TGCCCACCAC 
TGAAAAAAAG GOAAAACAAA 
GAGAGTTTCT ATGGATTCAC 
QATGaCTTAA GCCTQTGTAT 
AT(»GASSATT AAAACCAGAG 



3 AATOTATTCO 



GTCGTGTTGT 
OAGGTGATAQ 
GAGGTGTATG 
AAGGGC3VTCC 
CTAGGCTAGQ 
OAGGAIGAAC 



ACGATTCTTA 
GACAAAGGCC 
AACTAACAAT 
TATTTCACTT 
TTCTATTTAT 
AAAAAAGAAA 
AGAAAGAAGA 
CTGGTOTCAG 
GCACCTTCCC 



CGTGCAGTAG 
ACACAGTTTT 
ACAGAACCTT 
GGTCAGGTGG 



GTCTTCGTAG 
TGTCAACTTC 
GAAAGCC 



TQCCrrcCST GAGAC ACAAG 
GTTGTGATGA TAGCACIGGT 
AOTTC3AAAAO AGaTOGATTC 



H 



21 



31 



41 



51 



1020 
1080 
1140 
1200 



1500 
1560 
1620 
1680 



1980 
2040 
2100 
2160 
2220 
2280 

2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3S60 
3720 
3780 
3840 
3900 



| | | | | |

MSSWIRWHGP AMARLWGFCW LVVGFWRAAF ACPTSCKCSA SRIWCSDPSP GIVAFPRLEP
NSVDPENITE IFIANQKRLE IINEDDVEAY VGLRNLTIVD SGLKFVAHKA FLKNSNLQHI
NFTRNKLTSL SRKHFRHLDL SELILVGNPF TCSCDIMWIK TLQEAKSSPD TQDLYCLNES
SKNIPLANLQ IPNCGLPSAN LAAPNLTVEE GKSITLSCSV AGDPVPNMYW DVGNLVSKHM




NETSHTQGSL RITNISSDDS GKQISCVAEN LVGEDQDSVN LTVHFAPTIT FLESPTSDHH 300

WCIPFTVKGN PKPALQWFYN GAILNESKYI CTKIHVTNHT EYHGCLQLDN PTHMNNGDYT 360

LIAKNEYGKD EKQISAHFMG WPGIDDGANP NYPDVIYEDY GTAANDIGDT TNRSNEIPST 420

DVTDKTGREH LSVYAVVVIA SVVGFCLLVM LFLLKLARHS KFGMKDFSWF GFGKVKSRQG 480

VGPASVISND DDSASPLHHI SNGSNTPSSS EGGPDAVIIG MTKIPVIENP QYFGITNSQL 540

KPDTFVQHIK RHNIVLKREL GEGAFGKVFL AECYNLCPEQ DKILVAVKTL KDASDNARKD 600

FHREAELLTN LQHEHIVKFY GVCVEGDPLI MVFEYMKHGD LNKFLRAHGP DAVLMAEGNP 660

PTELTQSQML HIAQQIAAGM VYLASQHFVH RDLATRNCLV GENLLVKIGD FGMSRDVYST 720

DYYRVGGHTM LPIRWMPPES IMYRKFTTES DVWSLGVVLW EIFTYGKQPW YQLSNNEVIE 780
CITQGRVLQR PRTCPQEVYE LMLGCWQREP HMRKNIKGIH TLLQNLAKAS PVYLDILG

Seq ID NO: 598 DNA sequence
Nucleic Acid Accession #: AB052906
Coding sequence: 74..814






1 11 21 31 41 51 

I I I I I i 

AAAACCTTGA GGTCATTCAT CTTCCAGGCT CTCCTTCCAT CAAGTCTCTC CTCCCTAGOG 60 

CTCTGGGTCC TTAATGGCAG CAGCCGCCGC TACC3UVGATC CTTCTGTGCC TCCCGCTTCT 120 

20 GCTCCTGCTO TCCBGCTGGT CCCGGQCTGG GCGAGCOGAC CCTCACTCTC TTTGCTATGA 180 

CATCACCXTTC ATCCCTAAGT TCAGACCTGG ACCSW33GTGG TGTGCGGTTC AAGGCCAGGT 240 

GGATGAAAAG ACTTTTCTTC ACTATGACTG TGGCAAC3«G ACAGTCACAC CTGTCAGTCC 300 

CCTGGGGAAG AAACTAAATG TCACAACGGC CTGGAAAGCA CAGAACCCAG TACTGAGAGA 360 

^_ GGTGQTGGAC ATACTTACAG AGCAACTGOS TOACATTCAG CTGGAOAATT ACACACCCAA 420 

25 GGAACCCCTC RCCCTGCAGQ CCAGQATGTC TTGTGAGCAG AA AGCTG AAG GACACAGCAG 480 

TGGATCTTGG CAGTTCAGTT TCGATGGGCA'GATCrTCCTC CTCTTTGACT CAGAGAAGAO 540 

AATGTGGAC3V AOGaXTCATC CTGGAGCCAG AAAGATGAAA GAAAAGTGGG AOAATGACAA 600 

GGTTGTGGCC ATCTCCTTCC ATTACITCTC AATGGGAGAC TGTATAGGAX GGCTTGAGGA 660 

. CTTCrroATQ GGCATGGACA GCACCCTGGA GCCAAGTGCA GGAGCACCAC TCGCCATGTC 720 

30 CTCAQGCACA ACCCAACTCA GGGCCACAGC CACCACCCTC ATCCTTTGCT GCCTCCTCAT 780 

CATCCTCCCC TGCTTCATCC TCCCTGGCAT CTOAGGAGAG TCCTTTAGAG TGACAGGTTA 840 

AAGCTGATAC CAAAAGGCTC CTGTGAGCAC GGTCTTGATC AAACTCGCCC TTCTGTCTGG 900 

CCAGCTQCCC ACQACCTACX3 GTG-EATGTCC AGTGGCCTCC AGCavSATCaVT GATGACATCA 960 

TGGACOCAAT AGCTCATTCA CTGCCTTQAT TCCTTTTaCC AACAATTTTA OCAGCAGTTA 1020 

35 "TRCCTAACaVT ATTATGCAAT TTTCTCTTGQ TGCIACCIGA TGGAATTCCT GCACTTAAAO 1080 

TTCTGGCTOA CTAAACAAOA TATATCATTT TCTTTCTTCT CTTTTTGTTT GGAAAATCAA 1140 

GTACTTCTTT GAAIXSATQAT CTCTTTCTTG CAAATGATAT TQTCA QTAAA ATAATCACX3T 1200 

TA6ACTTCAG ACCTCXGOGG ATTCTTTCOG IQTCCTSftAA GAGAATTTTT AAATTATTTA 1260

. . ATAAQAAAAA ATTTATATTA ATQATTGTTT CCTTTAGTAA TTTATTGTTC TOTACTOATA 1320 

40 TTTAAATAAA GAOTTCTATT TCCCAAAAAA AAAAAAAAAA AA 



1 11 21 31 41 51

I I I I I I 

MAAAAATKIL LCliPIiUililiS GWSRAGRADP HSLCYDITVI PKFRPGPRWC AVQGQVDEKT 
FLHYDCGNKT VTPVSPLGKK LNVTTAWKAQ NPVLRBWDI LTEQIiRDIQI. EHYTPKEPLT 
LQARMSCEQK ABGHSSGSWQ FSFDGQIFLL FDSEKHMWTT VHPGARKMKE KWEWDKWAM 

50 SFHYPSMGDC IGWr,EDFrJ<Q MDSTLEPSAG APIiAMSSGTT QLRATATTLI LCCUjIILPC 
FILPGI 



Seq ID NO: 600 DNA sequence

Nucleic Acid Accession #: NM_001898.1

Coding sequence: 57..482



1 11 21 31 41 51 

I I I I I I 

GGCTCTCACC CTCCTCTCCT GCRGCTCCAG CTTTGTGCTC TGCCTCTQAG QAGACCATGG 

60 CCCAGTATCT GAGTACCCTG CTGCTCCTGC TGGCCACCCT AGCTGTGGCC CTGGCCTGGA 
GCCCCAAGGA GGAGGATAGG ATAATCCCGG GTGOCATCTA TAAOQCAGAC CTCAATGATG 
AGTGGGTACA GCGTGCCCTT CACTTCGCCA TCAGCGAGTA TAACAAGGCC ACCAAAGATG 
ACTACTACAG ACGTCCGCTG CGGGTACTAA GAGCCAGGCA ACAGACCGTT GGGGGGGTGA 
ATTACTTCTT CGACGTAGAG GTGGQCCGCA CCATATOTAC CAAGTCCCaG C CCAACT TGG 

65 ACACCTGTGC CTTCCATGAA CAGCCAGAAC TGCAGAAGAA ACAGTTGTGC TCTTTaOAGR 
TCTACGAAGT TCCCTGGGAG AACAGAAGGT CCCTGGTOAA ATCCRGGTCT CAAOAATOCT 
AGGGATCTGT GCCAGGCCAT TCGCACCAGC CACCACOCAC TCCCACCCOC TaTAOTaCTC 
CCACCCCTGG ACTGGTGGCC CCCACCCTGC GGGAGGCCTC CCCA1GTGOC TGCGCCAAGA 
GACAGACAGA GAAGGCPGCA GGAGTCCTTT GTTGCfTCAGC AGGSOBCTCT GCCCTCCCTC 

70 CTTCCTTCTT GCTTCTAATA GCCCTGGTAC ATGGTACACA CCCOCOCACC TCCTGCAATT 
AAACAGTAGC ATCGCC 



1 I I 1 I i 

MAOYLSTLIiL LLATLAVAUV WSPKEEDRII PGGIYHftDIiS DESJVQRALHF AJSEyNXATX 
DDYYRRPLRV LRARQQTVGG VNYFFDVEVG RTICTKSQPN LOTCAFHEQP ELQKXQLCSF 
BIYBVPWENR. RSIiVKSRCQB S 

Seq ID NO: 602 DNA sequence

Nucleic Acid Accession #: NM_003976.2

Coding sequence: 299..961
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20 




CTCTGAGCTT CTCTGAGCCT 
CATGGAGTTG TGAAAQAATA 
CTACTTCTGC TGGGTTGAGT 
GGGTGGCAGG CCX3GTCCCCC 
CAGGAGGGTG GGGGAACAGC 
QGAACTTGGA CTTGGAGGCC 
tGCCCTGTGG CCCACCCTGG 
GGGCTCCGCG CCCCQCAGCC 
CXK:C3GGCCAC ctgccggggg 
GCCGCCGCAG CCTTCTCGGC 






GGGCTGCCGC 



CGACCTCAGC 
GCCCGTCAGC 
CaUVCAGCACC 
AGGGCTCGCT 
CCTCrOGCAG 
AGGCCMCTAC 
CAOCCCCAGA 



CCCTCCTCTQ 
ACAGCATTTG 
CCTGTACTCA 



CTGCGCTCX5C 
GTGCGTTTCC 
CTGGCCAGCC 
CAGCCCTGCr 
TGGAGAACCG 
CCAGGGCTTT 
AGTCCCACTA 
CGGTGGGTGA 
GCCCTCACCC 
GGACCCACTT 
ATGAACACTA 
AAGGRCACAT 
CTCATGGGAG 



TGTTTGCTCA 
GCTGCAAAGC 
CTAGCTGTGT 
ACAAAAGATA 
TCAACAATGG 
TCTCCACGCT 
COGCTCTGGC 



CCGCX3CCX:CC 
CTGGGGGCCC 
A6CTGGTGCC 
GCTTCTGCAG 
TACTGGGOSC 
GCCGACCCAC 
TGGACCX3CCT 
GCAGACTGGA 
GCCAGCGGCC 
TGGATATCRT 
TGOSGATCCC 
CTCACAGACT 
CAGTGGCTGA 
ATTGCAGTTG 
CTGGCCXC 



TCTGGAAAAA 
ACCTAACACA 
AGGCCCCTTG 
ACTCATCTCT 
CTGATGGGCG 



TCTGCTGAGC 
CGAAGGCCCC 
CCQCTGGTQC 
GCCGCCTGCA 
GGGCAGCOGC 



GCGCTAOaUV 
CTCCGCCACC 
CCCTTACCX3G 



AGCCTAAAAG 
CTGGCACTGG 
GGCATCAGCC 
CTTGGTTGAA 



GGGGATTAAA 
TAGTAAGGTT 
TTCCTCACCT 
TAATTTGCAA 
CTCCTGGTGT 
CCCTGGCCTA 
AGCGTCGCAG 
CCGCCTCTCC 
AGTGGAAGAQ 
CCCCXaVTCTC 
GCTCGGGCAG 
CTCGGOCTGG 
CGCCGCGCGC 
GGACXIGCCCC 
GCIGGTCTCXrr 
GCCTGOGGGT 
TGGCTCTTCC 
AOGAAGGCCT 
GTGAAGGGAC 



GCTGCCTCAA 240 

TGATAGAGAT 300 

GGCGGCAGCC 360 

AGGCCTCCCT 420 

TGOCGTCXrCC 480 
CCXIGQCGGCC 
CTCTTCCCCG 



GCCTGGGCTG 960 



CCAGGCCTCG 
CCCGCCCAGG 
AGTGCCTGTQ 



CAAAGCTGAG 
AACTGACTAG 
CCTCAGCTAT 
AACCTGGGAC 
CCXrTGTAGGG 
CTGGAACTOa 



21 



31 



51



III!. 

MELGLGGLST LSHCPWPRRQ PALWPTIAAli ALI^SVAEAS LQ3APRSPAP REX3PPPVLAS 

PAGKLPGGRT ARWCSGRARR PPPQPSRPAP PPPAPPSALP RGGRAAKAGG PGSRARAAGA 

RGCRLRSQLV PVRALGLGHR SDEX.VRFRFC SGSCRRARSP HDLSIASLLG AGALRFPPGS 

RPVSQPCCRP TRYEAVSFMD VKSTWRTVDR LSATACGCLG 

Seq ID NO: 604 DNA sequence 

Nucleic Acid Accession #: NM_057091.1

Coding sequence: 783..1445
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75 



CGCGTGTCTA CAAACTCaAC 
CTCCATATCX: GAGGGGCCCC 
CAAGCTAGGO GGGACTGGAT 
CXXaSGCAGGG GCGCTCCCAG 
CACXXK.ACGG CTGCGGCGGC 
CAQACAAGGC CCGGGGGCTC 
CCCGGGCCTG GAGCCCCACA 



CGTQCCTCTC 

taccxx:cx:tc 
gagcagccag 
ggatctggtg 

CTGCTGAGGG 



TCAACAGGAO GGTGGGGGAA 
AGATGGAACT TGGACTTGGA 
AGCCTGCCCT GTGGCCCACC 
CCCTGGGCTC CGCX3CCCCGC 
CXXCCGCaSG CCACCTGCCG 

aaccGccGcx: gcagccttct 



CTTTCTCCOQ 
TCCCGGTTTC 
TCCCAGCATC 
CCGACGGGTG 
CCCCACCCCG 
GGGCAGGAGG 
CGCCAGCAGC 
CCXX3ASGGTa CAOACrGGCT 
GTACAGTCCT 
AAGGTGCCTA 
CAGCTCAACA 
GGCCTCTCCA 
CTGGCCGCTC 
AGCCCTGCCC 
GGGGGACGCA 
CGGCCOGCGC 



GCTOCX3AOQA 



ACGTCAACAO 
QCTGAGGGCT 
GGACCCTCCC 



CTAGCAGCCC 
CTATGGAGCC 
GGACCCCTCC 



CTGGCCTGTA 



GCTGGTGOST 
CAGCCTGGCC 
CAGCCAGCCC 
CACCTGGAGA 
CGCTCCAGGQ 
GCAGAGTCCC 
CTACCGGTGQ 
CAGAGCCCTC 
CTTCGGACCC 
TCTGATGAAC 
TTTGAAGGAC 
CTCACTCATG 



TCGCAGCTGG 
TTCCGCTTCT 
AGCCTACTGQ 
TGCTGCCX5AC 
ACCGTGGACC 



ATGGCTGK3G 
CGCTGTCCCA 
TGGCTCTGCT 
CCCGCGAAGG 

CGGCcascTa 

CCCCGCCGCC 
GCCCGGGCRG 
TGCCGGTGCG 



CCAACCTCGG 
GTGAGCCCC6 
ACGCTGGGGC 
ATGGAGTTGG 
GGCX;CCAGCC 
GCCAAGGCCA 
TGTTrGAGCT 
GTGCAGGACC 
GGCGCTCCTG 
CTGCCCCTGG 
GAGCAGC6TC 

ccccccxsccr 

GTGCAGTGQA 
TGCACCCCCA 
CCGCGCTCQG 



CTCGCTGCCA 
CACTTTTGGC 
TCGGGGGAGA 
CCGTGCTGCC 
GTGTTGATAG 
CCTAGGCGGC 
GCAGAGGCCT 
GTCCTGGCGT 
AGAGCCXXK3C 
TCTGCTCTTC 



GCAGCGGCrc 



ACTAGCCAGC 
GTGATGGATA 
ACCCTGCGGA 
ACTTCTCACA 
ACTACAGTOG 
ACATATTGCA 



CCACX3CGCTA 
GCCTCTCCGC 
TGGACCCTTA 



TCATCXXCGA 
TCCXaGCCTA 
GACTCTGGCA 
CTGAGGCATC 




11 



31 



41 



51 



I I I 1 I 

MELGLGGLST LSHCPWPRRQ PALnPTLAAIi ALLSSVAEAS LGSAPRSFAP REGPPPVLAS 
PAGHLPGGRT ARHCSGRARS PPPQPSRPAP PPPAPFSAI.P RGGRAARAGG EGSRARAAGA 
RGCRLRSQLV PVRALGLGHR SOBLVRFRFC SGSCRRARSP BOLSLASLLG AGALRPPPGS 
RPVSQPCCRP TRYEAVSFMD VNSTWRTVOR LSATAOQCLG 







Coding sequence: 1..714 







CAGASTCCCA 
TACCGGTGGG 
AGAGCCCTCA 
TTCGGACCCA 
CTQATQAACA 
TTGAAGGACA 
TCACTCATGG 



CTAGCCAGCQ 
TGATOSATAT 
CCCTGCGGAT 
CTTCTCACAG 
CTACAGTGGC 
CATATTGCAfi 
GAGCTGGCCC 



CCCCTCCOCC 
OGAOCCTTAC 
GCCTCAGCCA 
CATCCCCGAA 
CCCAGCCTAA 
ACTCTGGCAC 
TGAGGCATCA 
TTGCTTGGTT 



I I 

AGGTCCTTCC TCCtX»AGCC 60 

TCTCOGCGCA GCCTCCCCIG 120 

CAGAGGOCTC CCTGGGCTCC 180 

TCCTGGOGTC CCCCQCOGGC 240 

GAGCOCGGOG GCC8COGCGG 300 

CTGCTCTTCC CCXKXK3GGQC 360

CAGOSGGGGC GCGGGGCTGC 420 

TGGGCCACXX3 CTCCGACGAG 480 

CGCGCTCTCC ACACGACCTC 540 

CXXXX3G0CTC COGGCCCGTC 600 



1020 
1080 
1140 



GGQACOAAOQ CCTCAAAOCT GAGAGGCOCC 
CAGGTGAAGG 6ACAACTGAC TAGCAGCCCC 
AAGACACCAG AGACCTCAGC TATGGAGCCC 
TGGCCAGGCC TCGAACCTGG GACCCCTCCT 
GCCCCCX3CCC AGGCCCTGTA GGGACAGCAT 
GAAAGTGCCT GTGCTGGAAC TGGCCTGTAC 






11 



21 



31 



51 



I I I I I 

MPGLISARGQ PUiBVIiPPQA HIiGALFLPBA PliGLSAQPAL WPTLAAIiALL E 
APRSPAPREG PPPVI.ASPAG KI.PGGRTARH CSGSARRPPP QPSRPAPPPP APPSALPSG6 
RAARAGGPGS RARAAGARCC RUiSQLVPVR ALGLGHRSDE LVRFRFCSGS CRRARSPBDL 
SLASLIiGAGA LRPPPGSRPV SQPCCRPTRY BAVSFtffiVNS TWRTVDBIiSA TAGOCLO 

Seq ID NO: 608 DNA sequence

Nucleic Acid Accession #: NM_057090.1

Coding sequence: 29..715



CTGATG6GCG CTCCTGGTGT 
GTCCCACTGC 
GTOOCCXIACC 
CGCGCCCOOC 
CCACCTGCCG 
GCAGCCTTCT 



21 
I 

TGATAGASAT 
GGOSGCAGQC 
TGGCTCTOCT 
CCCGCGAAGG 
CGGCCCGCTG 



31 



41 



51 



I I 
GGAACTTOQA CTTGOAOGCC TCTCCAOGCT 
TOCACTTQGT CTCTOOSOSC AGCCTGOOCT 
GAGCAGCQTC QCAOAGGCCT CCXMGGGCTC 
COCCCOSCCT GTOCTGGCGT CCCCXZGCCGG 




TGCACCCCCA 
CCGCGCTCGQ 
CGCGCTCGGC 
CTGCCGCCGC 
CCTGCGACOG 



QCGCQCTCTC 
CCCCOGGGCT 
TCCTTCATGG 



TCTGATGAAC 
TTTGAAGGAC 
CTCACTCATG 



OACTCTGGCA 
CTGAGGCATC 
QTTGCTTGGT 



CCSGTGGCTC 
AGGGAOQAAG 
ACAGGTGAAO 
AAAGACACCA 
CTGGCCAGGC 



TCTGCTCTTC CCCGCGGGGG 360 



1020 
1080 
1140 



GCCrCAAAGC 
GGACAACTQA 
GAGACCTCAQ 



CCCGGCCCGT 
ACX3TCAACAG 
GCTGAGGGCT 
GGACCCTCCC 
TGAGAGGCCC 
CTAGCAGCCC 
CTATGGAGCC 
GGACCCCTCX: 



TGAAAGTOCC TQTOCTGQAA CTGGCCTOTA 



11 



21 



31 



51



I I I I I . 

MEL6LGGI.ST LSHCPWPRRQ APLGLSAQPA LNFTLAALAL LSSVAEASLG SAPRSPAFRE 
GPPPViASPA GHLPGOItTAR WCSGRARRPP FQPSRPAFPP PAPPSAIiPRG ORAARM3GPG 
SRARAAGARG CRIJISOI.VPV RALGLGHRSD ELVRFRFCSG SCHRARSPHD LSLA S LLBAG 
AIiRPPPGSRP VSQPCCRFTR YEAVSEVIDVH 87MRTVDIU:>S ATAOGCLO 



Nucleic Acid A 



ATGCCACTGA 
GCCTACCATO 
GGGGCACGCA 
CTCAACACGC 
QCCCTOAGGA 
GGCTCGCTGC 
TTCCAGGGCC 
CAGCOGGCCX: 
CTGGAATACA 



) ONA sequence 



CCTTTTGCTG 
CGAGTCTACC 
GCCCACCCCT 
ACTCAATGAG 



11 
I 

AGCATTATCT 
GCTGCCCTAQ 
TTGTGGCX3GT 
ACATCACTGA 
TTGAGAAGAA 
GCTATCTCAG 
TGGACAGCCT 
ACTTCTCCCA 
TCXZCTGACGO 



31 
I 

GTGGGCTGCC 
TGCTCCAGGG 
CTGCCCTGGA 
TCCCCOTTCC 
CGCATCACGC 
AAC3WGCTCC 



AAGCCTGGGG 
CCTCCCAGGT 
ACGCXIATGAO 
TCAATATCTC 



AGCCCTCATC 
COGAAACCIG 
CAT0G6CCTC 
OTTGCAGATC 



AGCCTTGGAC C 



STAG 6ACTCA0GAA GCTCAATCTO 
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GGC3UWSRATA GCCTCACXXa CATCTCACCC AC3GGTCTTCC AGCACXTTGOQ CAATCTCCAG 600 

GTCCTCCGGC TGTATGAGAA CAGGCTCACG GATATCXCCA TGGGCACTTT TGATGGGCTT 660 

GTTAACCTGC AGGAACTGGC TCTACAGCAG AACCAGATTG GACTGCTCTC CCKTGGTCTC 720 

TTCCACAACA ACCACAACCT CCAGAGACTC TACCTGTCCA ACA&CCACAT CTCCCAGCTG 780 

CCACCX»GCA TCTTCATGCA GCTGCCCCAG CTCSVRCOGTC TTACTCTCTT TGGGAATTCC 840 

CTGAAGGAGC TCTCTCT6CG GATCTTOGGG CCCATGCCCA ACCTGCGGGA GCTTTGGCTC 900 

TATOACAACX: ACATCTCTTC TCIACCCGAC AATGTCTTCA GCAACCTCOQ CCACTTeCAO 960 

GTCCTGATTC TTAQCCGCAA TCAGATCAQC TTCATCTCCC CGG6TGOCTT CAACQGGCIA 1020 

AOSGAGCTTC GGQAGCTOTC CCTCCACACC AACQCACTOC AGGACCTGGA CGGGAATGTC 1080 

TTCCGCATGT TGGCCAACCT GCAGAACATC TCCCTGCAGA ACAATCGCCT CaUSACAGCTC 1140 

CCAGGGAATA TCTTCGCCAA CGTCAATGGC CTCATGGCXav TCCAGCTGCA GAACAACCAG 1200 

CTGGAGAACT TGCCCCTCGG CATCTTCGAT CACCTGGGGA AACTGTGTGA GCTGCGGCTG 1260 

TATGACAATC CCTGGAGGTG TGACTCAGAC ATCCTTOKC TCCGCAACTQ GCTCCTGCTC 1320 

AACCAQCCTA GGTTAaGGAC GGACACTGTA CCTGTGTGTT TCAGCCCAGC CAATGTCCGA 1380 

GGCCAGTCCC TCATTATCAT CAATGTCAAC GTTCCTGTTC CAAGOGTCCA TGrCCXTTOAG 1440 

GTCCCTAGTT ACCCAGAAAC ACCATG6TAC CCAQACACAC CC3U3ITACCC TGACACCACA 1500 

TCCGTCTCTT CTACCACTGA GCTAACCAGC OCTOTGGAAa ACTACACIGA TCTQACTACC 1560

ATTCAGGTCA CTGATGACCG CAGOGTTTGG GGCATGACCC AGSCCCAGAG CGGOCTGOCC 1620 

ATTGCXIGCCA TTGTAATTGG CATTGTCGCC CTQGCCTGCT CCCTGGCTGC CTGCGTCGGC 1680 

TGTTGCTGCT GCAAGAAGAO GAGCCAAGCT GTCCTGATGC AGATQAAGGC ACCCAATGAG 1740 

TGTTAAAGAG GCAGGCTGGA GCAGGGCTGG GGAATGATGG GACTGGAGGA CCTGGGAATT 1800 

TCATCTTTCT GCCTCCACCC CTGGGTCCAT GGAGCTTTCC CGTGATTGCT CTTTCTGGCC 1860 

CTAGATAAAG GTOTGCCTAC CTCTTCCTGA CTTGCCTGAT TCTCCCGTAG AGAAGCAGGT 1920 

COTGCCGGAC CTTCCTACAA TCAGGAAGAT AaATCC3VACT GGCCATGGCA AAAGCCCTGG 1980 

GOAV n 'OO G A TTCATACCCC TOGGCTTCMI TOQAGAiaOaC TCTTCCTCCA AATCCTCCCC 2040 

ACCTGTCCIC CAAGAACAGC CTTCCCTGCG CCCAGGCCCC CTCCGGGCCT CTGTAGACTC 2100 

AGTTAGTCCA CAGCCTGCTC ACTTOGTGGG AATAGTTCTC CGCTGAGATA GCX:CCTCTOB 2160 

CCTAAGTATT ATGTAAGTTG ATTTCCCTTC TTTrGTTTCT CTTGTTTCTG CTATGGCTTG 2220 

ACOCAGCATQ TCCCCTCAAA TGAAAGTTCT CCXCTTGATT TTCTGCTCCT GAAGGCAGGG 2280 

TQWBTTCTCT CCTCAAAGAA GACTTCAAAC CATTTAACTG GTTTCTTAAG AGCCGTCAAT 2340 

CAGCCTGGTT TTGGGGATGC TATGAAA6AG AGAAGGAAAA TCATGCCGCT CAGTTCCTGG 2400 

AGACAOAAGA GCCGTCATCA GTGTCTCACT TGTGATTTTT ATCTGGAAAA GGAAGAAACA 2460 

CCCCAGCACA GCAAGCTCAQ CCTTTTRGAG AAGGATATTT CCAAACTGCA AACTTTGCTT 2520 

TQAAAAQTTT AGCCCTTTAA GGAATGAAAT CATGTAOAAT TTTGGACTTC TAAAAACATT 2580 

AAAATCAGCT TATTAATAOG GGATAOAGAA AGAAATCTGQ TGCCIGOGGO TCCCT8TQTT 2640 

CACCCCTAGA GTTTGTTTTA AAATTTTTAA TTGAAGCATO TGAAGIGTAC STGCAGAAAA 2700 

GTGGGAACAT GATAQTGTAT GGCTTGGTGG ATTTTCACAA ACTSAACATA CCTQTGTAAT 2760 

CAGCATCTAG ACCCAGACCC AGAGCATCAC AAATATCCCX: CATCCTGGGC TTTTCCCAGA 2820 

GGAGATGGGG GCTTCTGAAG ATGGACTTAC CTGGGACCTG CCCXXCATGA GCCAGGACGG 2880 

TCXXXCCACA GTCaGCCXGT GCAAAGGCCC CGTGGCCAGG GGTGGAGGAG AATATGTGGQ 2940 

TOTGGACAGa ATGSGAGACT GTGGCCTGAA CAQGAQATTT TATTATATCT GGA6ACCCTG 3000 

ASAGACCCro AGACCTGGGG CACCATGGCT GGCCAGGTCai GAAGCATCCT aACTGCAGAG 3060 

GTCCGTGCAC CC3«»CCCTC TTCCCTGCCA GCAAGTTOTC TGOGOCTCAT CGGAGGCCCC 3120 

TCCGCCTG6A GCCTTCTATG OACXnGATAT GCCIGTAXCT GTTTTTAATT TTCATTCTTC 3180 

ACTTAGGGGA AST6AAATCG CTCAGAGATG AGATCXTTTTA ATTCAAAACXS AAGTQTAACQ 3240 

GAATCTAQTG TCTTTCTAAT GTGGTAAAAT TCTCCATCAA CATCACAGTC AGCTGGCAGC 3300 

TGAACTTC3U3 AATCTCACTT ACAGCAGGCO ACACGGGGGT ACACCGATGG QTCACACTGQ 3360 

GTCTGGGGGC TCCCTGGAGC TCCTCCTGOG TGTGGTCTGG TTAGGAGTTG AGTTGTTTGC 3420 

TCCAGGGTTA TTCTCCTCCT CGAGTCACAG TCACACGAAT ACCTGCCTTC TCTGGCTTTC 3480 

CTGCTATACa CATATTCACA TGGCGCTC3VA GAAGTTAGGC TCATOGC3UVC GTGTCTCTTT 3540 

CTCTGGACRA CTGGCCCaVGT TTACRGTGAA ATGGAGAATT TC3W3GTCTCC ACGTCTGCCC 3600 

AGGAAAGAAC TTCAGCTGAC TCCACGGGGA TCTGGAAATC CACGACCAAT CCXX3ATCGGC 3660 

TCTTATTAGC TCCCCGCTCC ACAAGACACC TGTGCnTGG AAATCCACCA CCAATCCCGA 3720 

TCX3GCTCTTA TTAGCTCCCC GCTCCACAAG ACACCTGTCA TCTGGAAATC TACCACCAAT 3780 

CCCGATCGGC TCTTATTAGC TCCCCXKTTCC ACAAGACACC TGTGACATCC TCCAGGGCCA 3840 

CAGGAGCACG TGCTGACCAG TTTTCCCTTC CAGTTCCTGC ACAAAAAQTG TCCAOAOaaC 3900 

TGTTTGCAAA CACTAGTGCA CTTTGTAGCr TTTCACCCTC TGTCCCAGGO AATCTAGGAQ 3960 

AGATGAGGCC CGTCAGAGTC AAGAGATQTC ATCCCCCCAG GGTCTCCAAa GCATTTCCAC 4020 

ACTATTGGTG GCACCTGGAG GACATOCACC AAGGCTTGCC AGAGCCAACA GGAAOTGAOC 4080 

CCAGAGCATG GCACATGAGC ATCACCCGCT GATQGTGGCC TGCTGTGCCT GGTQCCAACA 4140 

GGGGCATCO: GGCCCGTACC CCTCCAGACA GGAAGCATGG GTTTGCCCAC AGACCTGTCQ 4200 

GGTGCTCCTG TQAGTGGCCT CCAGATGTCT TTGTGCATAG GCACAAGTGG GCCAGGGCTG 4260 

GAGGGAGOTQ GGAAACCTCA TCATCOGGTG GGCCCIGCCA ATCTTAACCC AGAACCCTTA 4320 

GGTATTCCTG GCAGTAGCCA TGACATTGGA GCACCTTCCT CTCCAGOCAO AGGCTGACCT 4380 

GAGGGCCACT GTCCTCAGAT GACACCACCC AGGAGCACCC TAGOTGAGGG GIGAGGGCCC 4440 

CCTTATGTOA ACCrCTTaCC TCTTCCTTTC TCOCKTCIK3A GTSGTTaGAT GGASCCATTG 4500 

GCCTCCTTTT CTTCAGCGGG CCCTTCAACC TCTCIGCACC ATGTTGTCTG GCTQBUSGAGC 4560 

TACTAGAAAA GCTGAGTGGA GTCTCCTTTC CAACAGGATG ATGCATTTQC TCAATTCTCA 4620 

GGGCTGGAAT GAGCOGGCTG GTCCCCCAGA AAOCTGGAGT GGGGTACAQA GTTCAGTTTT 4680 

CCTCTCTGTT TACAGCTCCT TGACAGTCCC ACX3CCCATCT GGAGTGGGAG CrGGGAGTTA 4740 

GTGTTGQAGA AGAAACAACA AAAGCCAATT AGAACCACTA TTTTTAAAAA GTGCTTACTG 4800 

TGCACAQATA CTCTTCAAGC ACTOOAOaTG OATTCTCTCT CTAGCCCTCA GCACCCCTGC 4860 

GCTAGGAGTG CCGCCTCTAC CCACTTGTGA TGGGSXACAG AGGCACTTGC TCTTCTGCAT 4920 

GOTSTTCAAT AGQCTGGGAG TTTTATTTAT CTCTTCAAAC TTTOTACAAG AGCTCATGGC 4980 

TreTCTTGGG CTTTGSTCAT TAAACCAAAG GAAATGOAAO OCAITCOCCT GTTaCTCrCC 5040 

TTAQTCTTGQ TCATCAGAAC CTCACTTOer ACCRTAXAGA TCAAAAGCTT TCTAACCACA 5100 

GOAAAAAATA AACTCrrOCA TCCCTTAAAO AATAGAATAG TTTGTCOCTC ICATGQGAAT 5160

TGGGCTGTAT GTATATTGTT CTTCCTCCTT AGAATTTAGA GATACAAGAG TTCTACTTAG 5220 

AACTTTTCAT GGACACAATT TCCACAACCT TTCAGATGCT GATGTAGAGC TATTGGGAAA 5280

GAACTTCCAA ACTCAGGAAG TTTGCAGAGA GCAGACAGCT AGAGATAACT CGGGACCCAG 5340 

AGTTGGTOQA CAGATGTTAG ATGTATCCTA GCTTTTAGCX: ATAAACCACT CAAAGATTCA 5400

QCCCCCAGAT CCCACAGTCA QAACTGAATC TGCGTTGTTG GGAAGCCAGC AGTGOCCrTG 5460 

GOAAGGAAGC CATGGCT6TQ OTTCAQAGAG GGIGGGCTGG CAAfiCCACTT CCGGGGAAAA 5520 

CTccrrccGC ccxaGorrTC xTcncTcrr AAGananQAT Torrcieacc arcccgcigc 5580

CTTCATGCTG CCTTCAAAGC TAGATCMXJT TTGCCTEGCT TAGAQAATTA CTGCAAATCA 5640 

GOCCCAOTGC TTGGGGATGC ATTTACAGAT TTCTAGGCCC TCAGGQTTTT GIAOASTGrO 5700 

AGCCCTGGTG GGCAGGGTTG GGGGGTCTGT CTTCrG C TOQ ATGCIQCTTO ZAATCCATTT 5760 
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GGTGTACM» ATCAACAATA AATAATATAC ATBTAT 



1 11 21 31 41 51

I I I I I I 

MPLKHYIiLLL VGCQAWGAGL AYHGCPSECT CSRASQVECT GARIVAVPTP LPWNAMSLQI 
LNTHITELNE SPFLNISALI ALRIEKNELS RITPGAPRKL GSIiRYLSLAN NKLQVLPIGL 
PQGLDSIBSL Ll^SNQLLQI QPAHFSQCSN LKELQUICaaH LEYIPDGAFD HLVGLTKLNIi 
OKNSLTHISP RVFQHtGNljQ VLRI,yBNRI.T DIPMGTFDGIi VNLQEIALQQ MQIGLIiSPGL 
FHNNHNliQRIi YLSNNHISQIj PPSIFMQLPQ LNRLTLFGNS LKELSLQIFG PMPNLRELWIi 
VDNHISSLPD HVFSNLRQLQ VLILSRNQIS FISPGAFKGIi TELRELSLHT NALQDLDGNV 
PRMUttlliQNI SLQMMRUtQIi P^IFANVNG LMAIQLQNNQ USNLPLGIFD HI^KLCEUil. 
YDHFWRCDSO ILPLRNWI.I.L NQPRIiGTOTV PVCFSPANVR GQSIiIIINVN VAVPSVHVPE 
VPSYPETPWY PDTPSYPDTT SVSSTTELTS PVEDYTDLTT IQVTMJRSVW GMTQAQSGU^ 
lAAIVIGIVA LACSLAACVG CCCCaCKRSQA VLMQMR3VPHB C 

XM_098151 
Coding sequence: 1..447 

1 11 21 31 41 51 

I I I I I I 

ATGATGCATT TGCTCAATTC TCAGGGCTGG AATGAGCCGG CTGGTCCCCC AGAAAGCTGG 
AGTGGGGTAC AGAGTTCAGT TTTCCTCTCT GTTTACAGCT CCTTGACAGT CCCACOCCCA 
TCTGGAGTGG GAGCTGGGAG TCAGTGTTGG AGAAGAAACA ACAAAAGCCA ATTAGAACCA 
CTATTTTTAA AAAGTGCTTA CTGTGCACAG ATACTCTTCA AGCACTGGAC GTGGATTCTfc: 
TCTCTAGCCC TOUjCACCCC TGCGGTAGGA GTGCOGCCTC TACCCaCTTQ TQATGGGGTA 
CAGAGGCACT TGCTCTTCTa CATGGTQTTC AATAGGCTGG GAGTTTTATT TATCTCTTCR 
AACTTTGTAC AAGAGCTCAT GGCTTGTCTT GGaCTTTOOT CATTAAACCA AAGOAAATOQ 
AAGCCATTCC CCTGTrOCTC TCCTTAG 



I I I I I I 

MMHLLNSQGW KEPAGPPBSW SGVQSSVFLS VYSSLTVPRP SGVGAGSQCH RRNUXSQLEP 60 

LFLKSAYCAQ II.FKHWTWIL 8LAI.STEAV6 VPPLPTCDGV QRHUFCMVF NRLGVIiFISS 120 
NFVQELf4ACL GLSSLNQRKH KPFPCX:SP 

Seq ID NO: 614 DNA sequence

Nucleic Acid Accession #: NM_002658.1

Coding sequence: 77..1372

1 11 21 31 41 51

I I I I I I 

GTCCCCX3CAG CGCCGTCGCG CCCTCCTGCC GCAGGCCACC GAGGCCX5CCG CCGTCTAGCG 60 

CCCCGACCTC GCCACCATQA OAGCCCTaCT GGCQCGCCTa CTTCTCTGCQ TCCTGGTCGT 120 

GAGOGACTCC AAAOGCACCA ATGAACTTCA TCAAGTTCCA TOBAACrGTO ACTGTCTAAA 180 

TGGAGOAACA TCTGTGTCCA ACAAOTACIT CTCCAACATT CACIGGTGCR ACTGCXX3UUV 240 

GAAATTCSGGA GGGCA6CACT GTGAAATAGA TAAflTCAAAA ACCIGCTAIG AQGOnATGG 300 

TCACTTTTAC C3GAGGAAAGO CCAGCSUITGA CACCATG6GC CGGCCCTGCC TGCCCTGGAA 360 

CTCTGCCACT GTCCTTCAGC AAACGTACCa TGCCCACAGA TCTGATGCTC TTCAGCTGGG 420 

CCTGGGGAAA CATAATTACT GCAGGAACCC AGACAACCGG AGGCGACCCT GOTGCTATQT 480

GCAGGTGGGC CTAAAGCCGC TTGTCCAAGA GTGCATGGTG CATGACTGCG CAGATGGAAA 540

AAAGCCCTCC TCTCCTCTAG AAGAATTAAA ATTTCAGTGT GGCCAAAAGA CTCTGAGGCC 600

CCGCTTTAAG ATTATTGGGG GAGAATTCAC CACCATCX5AG AACCAGCCCT GGTTTGCGGC 660 

CATCIACAOO AGGCACCGGG GGGGCTCTGT CACCTACGTG TGTGGAGGCA GCCTCATCAG 720 

CCCTTGCTGG (STGATCAGCG CCACACACTG CTTCATTQAT TACCCAAAQA AGGAOGACTA 780

CATCX3TCTAC CIGGGTCGCT CAAGOCTTAA CTCCAACACO CAAGQGOAQA TGAAGTTTGA 840 

G6TOQAAAAC CTCATCCTAC ACAAGQACTA CAaOGCTOAC ACGCTTGCTC ACCACAACGA 900 

CATTGCCTTG CTGAAGATCC GTTCCAAGGA GGGCAGGTGT GCGCAQCCAT CCCOOACTAT 960 

ACAGACCATC TGCCTGCCCT CGATGTATAA CGATCCCCAG TTTGGCACAA GCTGTGAaAT 1020 

CACTGGCTTT GGAAAAGAGA ATTCTACCGA CTATCTCTAT COGGAGCSVGC TOAAAATQAC 1080 

TGTTQTGAA6 CTGATTTCCC ACCGGGAGTG TCAGCAGCCC CACTACTACG GCTCrQAAOT 1140 

CACCACCAAA ATGCTATGTG CTGCTGACCC CCAATGGAAA AOWSATTCCT GCC3VGGGAGA 1200 

CTCAGGGGGA CCCCTCGTCT GTTCCCTCCA AGGCOGCATG ACTTTGACTG GAATTGTGAG 1260 

CTGGGGCaST GGATGTGCCC TGAAGGACAA GCCAGGCGTC TACACGAQAG TCTCACACTT 1320 

CTTACCCTGG ATCCGCAGTC ACACCAAG6A AGAGAATCGC CTGGCCCTCT GAGGGTCCCC 1380 

AGGGAGGAAA CGGGCACCAC CCGCTTTCTT GCTGGTTGTC ATTTTTGCAG TAGAGTCATC 1440 

TCCATCAGCT 6TAAGAAGAG ACTGGGAAGA TAGGCTCTGC ACAGATGOAT TT6CCTGTGG 1500 

CACCACCAGQ OTGAAOSACA ATAGCTTCAC CCTCACGGAT AGGCCTGGGT GCTGGCTGCC 1560 

CAOACCCTCT aGCCASGATG GAGGGGT6QT CCTGACTCAA CATGTTACia ACCAGCAACT 1620 

TGTcrrrrrc tcgacioaag cctgcagbaq ttaaaaaggo csvgggcatct cctgtqcatg 1680

GGCTCGAAGO GAGAGCCAGC TCCCCOGACC GGTGGGCATT TGTGAGGCCC ATGGTTGAGA 1740 

AATGAATAAT TTCCCAATTA GGAAGTGTAA GCAGCTGAGG TCTCTTGAGG GAGCTTAGCC 1800 

AATGTGGGAG CAGOSGTTTG GGGAGCAGAG ACACTAACGA CTTCAGGGCR GGGCTCTGAT 1860 

ATTCCATGAA TGTATCAGGA AATATATATG TGTGTGTATG TTTGCACACT TGTTGTGTGG 1920 

GCTGTGAGTG TAAGTGTGAG TAAGAGCTGG TGTCTGATTG TTAAGTCTAA ATATTTCCTT 1980 

AAACTGTGTG GACTGTGATG CCACACAGAG TGGTCTTTCT GQAGAGGTTA TAGGTCACTC 2040 

CTGGGGCCTC TTGGGTCCCC CAGGTGACAQ TGCCTGGGAA TGTACTTATT CTQCAGCATG 2100 

ACXrrGTGACC AGCACTGICT CAQTTTCACT TICACATAGA TGTCCCrTTC TTGGCX3USTT 2160 

ATCCCTTCCT TTTAQCCTAG TTCATCCAAT CCTCACTGGG TGGGQT6AGG ACCACTCCTT 2220 

ACaCTGAATA TTTATATTTC ACIA rilTlA TTTATATTTT TGTAATTTTA AATAAAAGTG 2280
ATCAATAAAA TCTOATTTTT CTGA 
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1 11 21 31 41 51 

I I I t I I 

MHALLAHIiliL CVLWSDSKG SNELHQVPSM CDCLNGGTCV SHOTPasT 

HCEIDKSKTC YEGNGHPYRG KASTimKaiP CLPHNSATVI. QQTVHRHRSD ALQLGLGKBN 120 

YCHNPDNRRR PWCYVQVGLK PLVQECSWHD OUXjKKPSSP PEELKFQCGQ KTLRPRFiai 180

GGEFTTIEWQ PWFAAIYRRH RGGSVTYVOG GSLISPCHVI SaTHCPIDYP KKEOYZVYLO 240 

RSRUISNTQG EMKFEVENI.I LHICDYSAOTL AHHNDIALLK IRSXBQRCAQ PSRT IQTIC Ii 300 

PSMYNDPQFG TSCEITGFGK ENSTDYLYPB QUa4TVVia.I SHRBOQQPHY YGSBVTTKMI. 360 

CAADPQWKTD SCQGDSGGPL VCSLgGBKTI. TGIVSWGSGC HJCDKPaVYT RVSHFIiPHIR 420 
SHTKEENGIA I. 

Seq ID NO: 616 DNA sequence 

Nucleic Acid Accession #: NM_024422.1

Coding sequence: 202..2907

1 11 21 31 41 51 

I I I I I I 

CGCCAAAGGA AAAGCCCCTT GQATGAGAGG CW3G0GCTTC AGAGAAGCTA AGAAAAGCAC 60 

CTCTCCGOGC GCCCCACCTC CTCCGCTTCG CX3CTCCTCCT GAGCAGCGGG CCCAGACTGC 120 

GCTCXX3GCXS3 CGGCCCTCGC CCCGOGGAGC CCTCCTACCC OGGCCOGACG CTCGGCCGGC 180 

QACCTGCCCC GAGCCCTCTC CATGGAQGCA GCCCGCCCCT COQGCTCCTG GAACGGAGCC 240 

CTCTGCCGGC TGCTCCTGCT OACCCTCGCG ATCTTAATAT TTGCC ASTG A TGCCTGCAAA 300 

AAT8TOACAT TACATGTTCC CTCCAAACTA. GATQCOSAGA AACTTGTTGO TAGAGTTAAC 360 

CTGAAAGAOT GCTTTACAQC TGCAAATCTA ATTCATTCAA GTGATCCTOA CTTCCAAATT 420 

TTGOAaGATO GTTCAGTCTA TACAACAAAT ACTATTCTAT TGTCCTCGQA GAA GAGAAG T 480 

TTTACCATAT TACTTTCCAA CACTGAGAAC OUVGAAAAGA AGAAAATATT TGTCTTTTTG 540 

GA6CATCAAA CAAAGGTCCT AAAGAAAAGA CATACTAAAG AAAAAGTTCT AAGGCGCGCC 600 

AAQAGAAGAT GGGCTCCAAT TCCTTGTTCG ATGCTAGAAA ACTCCTTGGG TCCTTTTCCA 660 

CTrrXCCTTC AACABQTTCA ATCIGACACG GCCCAAAACT ATACCATATA CTATTCCaTA 720 

AGAGGTCCTG GAGTTGACCA AOAACCTCSS AATTTATTTT ATGTQQAGAQ AGACACTGGA 780 

AACTTOTATT GTACTCGTCC TCTAGATOGT GAGCBGTA3X! AATCTTTTGA GATAATTGCC 840 

TTTGC3\ACAA CTCCAGATGG GTATACTCCA GAACTTCXAC TGCCCC TAAT AATCAAAATA 900 

GAGGATGAAA ATGATAACTA CCCAATTTTT ACAGAAGAAA CTTATACTTT TAOATTTTT 960 

GAAAATIGCA GAGTCGGCAC TACTGTGGGA CAAGTGTOTG CTACTGACAA AGATGAGCCT 1020

GACACQATGC ACACACXKXrr GAAGTACTCC ATCATTGGGC AGGTGCCACC ATCACCCaCC 1080 

CTATTTTCTA TGCATCCAAC TACAGGCGTG ATCACCACAA CATCATCTCA GCTAGACRGA 1140 

GAGTTAATTG ACAAGTACCA GTTGAAAATA AAAGTACAAG ACATGGATGG TCAGTATTTT 1200 

GGTCTACAGA CAACTTCSUIC TTQTATOITT AACATTGATG ATGTAAATQA CCACTTGCCA 1260 

ACATTTACTC GTACTTCTTA TOTGACATCA GTGGAASAAA ATACAQTTGA TQTGGAAATC 1320 

TTACX3ASTTA CI6TTGAGGA TAAGGACTTA GTOAATACIG CTAACTQGAG AGCTAATTAT 1380 

ACCATTTTAA AGGGCAATGA AAATGQCAAT TTTAAAATTG TAACAGATOC CAAAACCAAT 1440 

GAAGGAGTTC TTTQTGTAGT TAAGCCTTTG AATTATGAAG AAAACCAACA GATOATCTTO 1500

CAAATTGGTG TAGTTAATGA AGCTCCATTT TCCSVQAGAGG CTAGTCCAAG ATCAGCCATG 1560 

AGCACAGC3VA CAGTTACTGT TAATGTAGAA GATCAGGRTG AGGGCXZCTGA GTGTAACCCT 1620 

CCAATACAGA CTGTTCGCAT GAAAGAAAAT GCAGAAGTGG GAAC3UVCAAG CAAXGGATAT 1680 

AAAGCATATG ACCCM3AAAC AAGAAGTAGC AGTGGCATAA GGTATAAGAA ATTAACTGAT 1740 

CCAACAGGGT GGGTCACCAT TGATGAAAAT ACAGGATCAA TCAAAGTTTT CAGAAGCCTG 1800 

GATAGAGAGG CAGAQACCAT CAAAAATGGC ATATATAATA TTACAGTCCT TGCATCAGAC 1860 

CAAGGAGGGA GAACRTGTAC GGGGACACTG GGCATTATAC TTCAAQACGT GAATGATAAC 1920 

AGCCCaTTCA TACCTAAAAA GACAGTGATC ATCTQCaAAC C CACC ATGTC ATCTGCGGAG 1980 

ATTGTTGCGG TTGATCCTCA TGAfiCCTATC CATGGCCCAC CXTrTTQACTT TAGTCTQGAG 2040 

AGTTCTACTT CAGARGTACA GAOAATGTGG AGACTGAAM3 CAATTAATOA TACAGCKGCA 2100 

CGTCTTTCCT ATCAGAATGA TOCTCCATTT QGCTCAZATG TASTACCTAT AACASTGASA 2160 

GATAGACTTG GCATGTCTAO TGTCACTTCA TTGOATGRA CACTGTGTGA CTGCATTAOC 2220

GAAAATGACT GCACACATCG TGTAQATCCA AGGATTGQC6 GTGGAGGAGT ACAACTTGQA 2280 

AAGTGGQCCA TCCTTGCAAT ATTGTTGGGC ATAGCaTTOC TCTTTTGCAT CCTGTTTACG 2340 

CTGGTCTGTG GGGCTTCTGG GACGTCTAAA CAACX»AAAO TAATTCCTGA TGATTTAGCC 2400 

CAGCAGAACC TAATTGTATC AAACACAGAA GCTCCTGGAG ATGACAAAGT GTATTCTGC3G 2460 

AATGGCTTCA CAACCCAAAC TGTGGGCGCT TCTGCTCAGG GAGTTTGTGG Cf^CartOBOt^. 2520 

TCAGGAATCA AAAACGGAGG TCAGGAGACC ATOQAAATOQ TGAAAGGAGG ACACCAGACC 2580 

TaOOAATCCT GCCGGGGGGC TGGCCACCAT CACACCCIGG ACT CCTGC AG GGGAGGACAC 2640 

AOSaAeSTBQ ACAACrOCAO ATACACTTAC TOGQASTGGC ACJKJTTTIAC TCAQCCCrCT 2700 

CTTGSTGAAA AAGTGTATCT GTGTAAICAA GATGAAAATC ACAAOCATGC CCAAGACTAT 2760 

QTCCTOACAT ATAACTATOA AGGAAGAQOA TCGGTQQCTG GGTCTGTAGG TTGTTGCAGT 2820 

GAACGAC3VAG AAGAAGATGG GCTTQAATTT TTOQATAATT TGGAGCCXaA ATTTAGGACA 2880 

CTAGCAGAAG CATGCATGAA GAGATGAGTG TGTTCTAATA AGTCTCTGAA AGCCAGTGGC 2940

TTTATGACTT TTAAAAAAAA TTACAAACCA AGAATTTTTT AAAGCAGAAG ATGCTATTTG 3000 

TGGGGGTTTT TCTCTCATTA TTTGGATGGA ATCTCTtTGG TCAAATGCAC ATTTACAGAG 3060 

AGACACTATA AACAAGTACA CAAATTTTTC AATTTTTACA TATTTTTAAA TTACTTATCT 3120 

TCTATCCAAG GAGGTCTACA aAGAAATTAA AQTCTGCCTT ATTTGTTACA TrTGCGTATA 3180 

ATGACAACAG CCAATTTATA GIGCAATAAA AXGTAATCAA TTCAAGTCCT TATTATAQAC 3240 

TATTTGAAOC ACAACCTAAT GGAAAATTQT AGAGACCrrG CTTTAACATT ATCTCKAOTT 3300 

AATTAAGTGT TCATGTGGTG CTTGOAAACT GTTGTTTTCC T(»ACATCTA AAGTGTGTAO 3360 

ACTGCATTCT TGCTATTATT TTATTCTTGT AATGTQACCT TTTCACTGTG CAAAGGOAOA 3420 
TTTCTAGCCA GGCATTOACT ATTACAATTT CATT 



Protein Accession #: NP_077740.1



AHLIHSSDPD FQILEDGSVY TTNTILIiSSB KRSFTIIiLSir TSIQBKXKIF VFIiBHQTKVl. 
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KKHHTKEKVL RRAKRRWAPI PCSMLEHSLG PPPIjFLQQVQ S0TAQNYTIY YSIRGPGVDQ 180 

EPRMLFYVER DTGNLYCTRP VDREQYESFE IIAFATTPDG YTPELPLPLI IKIEDENDNY 240 

PIFTEETYTF TIFENCRVGT TVGQVCATDK DEPDTMHTRL KYSIIGQVPP SPTLFSMHPT 300 

TGVITTTSSQ LDREIilDKYQ LKIKVQDMDQ QYFGUQTTST CIIHIDDVND HIiPTFTHTSY 360 

VTSVEENTVD VEILRVTVED KDLVNTAMHR MtTTlUOWE traKFKIVTDA KTNBGVLCW 420 

KPLNYEBKQQ MILQIGWNB APFSREASPR SAMSTATVTV NVEDQDEGPE CMPPIQTVRM 480 

KEMAEVGTTS NGYKAYDPET RSSSGIRYKK LTDPTGWVTI DENTGSIKVF RSLDREAETI 540 

KKGIYNITVL ASDQGGRTCT GTLGIILQDV NDNSPPIPKK TVIICKPTMS SAEIVAVDPD 600 

EPIHGPPFDP SLESSTSEVQ RMWRLKAIND TAARLSYQUD PPFGSYWPI TVRDRLGMSS 660

VTSUJVTLCD CITESDCTHR VDPRIGGGGV QliGKHAIUa liGIALLPCI LFTLVCGASG 720 

TSKaPKVIPD DLAQQHI.IVS NTEAPGDOKV YSAHGFTTQI VQASAQ6VGG TVGSGIXNGG 780 

QETIEKVKS6 HQTSESCROA GRRRTUSSCR OQHTBVIHICR YTYSEMHSFT QPRLGEKVYI. 840 

CNQDENHKHA QDYVIiTYKYB GRGSVAOSVG CCSBRQEEDG LEFLDNLEPK FRTIAEACMK 900 
R 

Seq ID NO: 618 DNA sequence

Nucleic Acid Accession #: NM_004949.1

Coding sequence: 202..2745

1 11 21 31 41 51 

I - I I I I I 

CGCCAAAGGA AAAGCCCCTT GGATGAGAGG CAGGCGCTTC AGAGAAGCTA AGAAAAGCAC 60 

CTCTCCGCGC GCCCCACCTC CTCCGCCTCG OGCTCCTCCT GAGCAGCGGG CCCAGACTGC 120 

GCrCCGGCCG CGGCCCTCGC CCCGCGGAGC CCTOCTACCC OGGCCCOAGG CTCGGCCQQC 180 

GACCTGCCCC GAGCCCTCTC CATOQAQGCA GCCCGCOOCT COOGCTOCIG GAAOQGAGOC 240 

CTCTGCOXK: TGCTCXTTGCT GACCCTCGCO ATCTTAATAT TTGCCAGTQA TCCCTGCAAA 300 

AATGTGACAT TACATGTTCC CTCXaUVACTA GATOCCGASA AACTTGTTGQ TAOAQTTAAC . 360 

CTGAAAGAGT GCTTTACAGC TGCAAATCTA ATTCATTCAA GTGATCCTGA CTTCCAAATT 420 

TTGGAGGATG GrTCAGTCTA TACAACAAAT ACTATTCTAT TGTCCTCGGA GAAGAGAAGT 480 

TTTACCaTAT TACTTTCCAA CACTGAQAAC CAAGAAAAGA AGAAAATATT TGTCTTTTTG 540 

GAGCATCAAA CAAAGGTCCT AAAOAAAAGA CATACTAAAG AAAAAGTTCT AAGQ0QC3GCC 600 

AAGAGAAGAT GGGCTCCAAT TCCTTGTTOS ATGCTAGAAA ACTCCTTGGG TCCTTTTCCA 660 

CTTTTCETTC AACAGGTTCA ATCTGRCAOG GCCCARARCT ATACCATATA CTATTCCATA 720 

AGAGGTCCTG GAGTT6ACCA AGAACCTCGG AATTTATTTT ATGTOQAGAG AQACACIGaA 780 

AACTTGTATT GTACTCGTCC TGTACATCGT GAQCAOTATO AATemrTOA GATAATTGCC 840 

TTTGCAACAA CTCCAQATGa OTATACTCCA GAACTTCCAC TGCCCCTAAT AATCAAAATA 900 

GAGGATGAAA ATGATAACTA CCCAATTTTT ACAGAAQAAA CTTATACTTT TACAATTTTT 960 

GAAAATTGCA GAGTGGGCAC TACTGTGGGA CAAGTGTGTG CTACTGACAA AGATGAGCCT 1020 

GACACGATGC ACACACQCCT QAAGTACTCC ATCATTGGGC AGGTGCCACC ATCACCCACC 1080 

CTATTTTCTA TGCATCC34AC TACAGGOGTG ATCACCACSiA CMCATCTCA GCTAGACAQA 1140 

GAGTTAATTG AOVAGTACCA GTTGAAAATA AAA6TACAAG ACATGGATGG TCAGTATTTT 1200 

GGTCTACAQA CAACITCAAC TTBTATCATT AAC»TTQATQ ATQTAAAXGA CCACTTOCCA 1260 

ACATTTACTC QTACTTCTXA TGTOACATCA GTGGAnSAAA ATACAGTTGA TOTOSAAATC 1320 

TTACQAQTTA CraTTQAiOGA TAAGOACrTA GIQAATACTS CTAACTGOAa AGCTAATTAT 1380 

ACCATTTTAA AGGGCAATQA AAATGGCAAT TTTAAAATTO TTVACAQATQC CAAAACC3VAT 1440 

QAAGGAGTTC TTT6TGTAGT TAAGCCTTTG AATTAT6AAQ AAAAGCAACA GATGATCTTC 1500 

CAAATTGGTQ TAGTTAATGA AGCTCC3VTTT TCCAQAGAGG CTAGTCCaAG ATCaGCCATG 1560 

AGCACAGCAA CAGTTACTGT TAATGTAGAA QATCAGGATG AGGGCCCTGA OTGTAACCCT 1620 

CCAATACaGA CTGTTCGCaVT GAAAGAAAAT GCAGAAGTGG GAACAACAAG CAATGGATAT 1680 

AAAGCATAT6 ACCCAGAAAC AAGAAGTAQC AQTQGCATAA GOTATAAGAA ATTAACTGAT 1740 

CCAACAGGGT GGGTCACCRT TGATQAAAAT ACAGGATCAA TCAAAGTTTT CASAAGCCTG 1800 

GATAGAGAGG CAGAGACCAT CAAAAATOGC ATATATAATA TTACBOTCCT TOCATOWSAC 1860 

CAAGGAGGGA 6AACATGTAC GGGGACACTC GGCATTATAC TTCAAOACQI GAATGATAAC 1920 

AGCCCATTCA TACCTAAAAA QACAGTGATC ATCTGCaUVAC CCACCATGTC ATCTGCX3GAG 1980 

ATTCTTGCGG TTGATCCTGA TGAGCCTATC CaiTGGCCCAC CCTTTCACTT TAGTCTGGAG 2040 

AOTTCTACTT CRGAAGTACA GAGAATGTGG AGACTOAAAG CAATTAATGA TAOVGCAQCA 2100 

CGTCTTTCCT ATCAGAATGA TCCTCCATTT GGCTCATATG TAGTACCTAT AACAGTQAGA 2160 

GATAGACTTG GCATQTCTAQ TGTCACTTCA TTGGATGTTA CACTGTGTGA CTGCATTACC 2220 

GAAAATGACT GCHChCKICG T6TAGATCCA AGGATTGGGG GTGGAGGAGT ACAACTTGGA 2280 

AASTGGGCCA TCCTTGCAAT ATTGTTGQGC ATAGCATTGC TCTTTTGCAT CCTGTTTACG 2340 

CT G OTCTBT G GGGCTTCTGO OAOOTCTAAA CAACCAAAAG TAATTCCIGA TQATTTAQCC 2400 

CAGCAOAACC TAATTOTATC AAACACAGAA GCTCCrOGAa ATOACAAAGT GTATTCIGCG 2460 

AATOOCTTCA CAACCCAAAC TGTGGGOGCT TCTGCTCAQG aAGTTTCTGG CACCOrGGOA 2520 

TCAGGAATCA AAAAOGGAGG 7CAGGAGACC ATOGAAATGG TGAAA6GAGO ACACX»GACC 2580 

TCGQAATCCT GCCGGGGGGC TGGCCACXa.T CACACKCTGG ACTCCTGCAfi QGGftGGACAC 2640 

ACXXSAGGTGQ ACAACTQCAG ATACACTTAC TCGGAGTGGC ACAGTTTTAC TCAGCCCOGT 2700 

CTTGGTGAAG AATCCATTAG AGGACACACT CTGATTAAAA ATTAAACAAT OAAAGAAAGT 2760 

GTATCTGTGT AATCAAGATG AAAATCACAA GC3VTGCCCAA GACTATGTCC TGACATATAA 2820 

CTATGAA<KA AGAQGATCGG TGGCTGaGTC TaTAGGTTGT TGCAOTGAAC QACAAGAAGA 2880 

AGATGGGCTT GAATTTTTGG ATAATTTGGA GCCCAAATTT AGGACACTAQ CAGAAGCATG 2940

CATGAAGAGA TGAGTGTGTT CTAATAAGTC TCTGAAAGCC AGTGGCTTTA TGACTTTTAA 3000 

AAAAAATTAC AAACXIAAGAA TTTTTTAAAG CAGAAGATGC TATTTGTGGG GGTTTTTCTC 3060 

TCATTATTTG GATGGAATCT CTTTOGTCaW. ATGCACATTT ACAGAGAGAC ACTATAAACA 3120 

AGTACACAAA TTTTTCAATT TTTACaiTATr TTTAAATTAC TTATCTTCTA TCCAAGGAGG 3180 

TCTACAGAGA AATTAAAGTC TGCCTTATTT GTTAC3VTTTG OGTATAAXOA CAACnaCX»A 3240 

•tTTATAGTGC AATAAAATGT AATTAATTCA AGTCCTTATT ATAGACtATT TGMGCACAA 3300 

CCTAATGGAA AATTGTAGAG ACCTTQCrTT AACATTATCT CCAGTTAATT AAGTGTTCAT 3360 

GTGGTQCTTG GAAACTGTTQ TTTTCCTGAA CATCTAAAGT GIGTAGACT6 CRTTCITGCT 3420 

ATTATTTTAT TCTTGTAATO TGACCTTTTC AdQIQCAAA GGGAOATTTC TA6CX»GGCA 3480 
TTGACTATTA CAATTTCATT 

Seq ID NO: 619 Protein sequence 
Protein Accession #: NP_004940.1

1 11 21 31 41 51

I i I I I I 

MBAARPSOSW NQALCRI1I.U. TLAIIiZFASD ACKHVTIiIIVP SXLSAEKItVO RVHUOSCFTA 60 
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ANLIHSSDPD FQILEDGSVY TTHTILLSSB KRSFTILI.SN TENQBKKKIF VFIjEHQTKVL 120 

KXRHTKEKVL RRAKRRWAPl PCSMLENSLG PFPLFLQQVQ SDTAQNYTIY YSIRGPGVDQ 180 

EPRNIiFYVER DTGNLYCTRP VDREQYESPE IIAPATTPDO YTPELPLPLI IKIEDENDNY 240 

PIFTEBTYTF TIFEHCRVGT TVC3QVCATDK DBPDTMHTRI. KYSIICSQVPP SPTIiFSMHPT 300 

TGVITTTSSQ LDRELIDKYg LKIKVQDMDQ QYFGIiQTTST CIINIDDVND HLPTPTRTSY 360 

VTSVEBHTVD VEILRVTVED KOLVNTANHR ANYTHiKOHE KGNPKIVISA KXMEOVIiCW 420 

KPUIYEEKOQ MILQIGWNB APFSREASPR SAMSTATVTV NVEDQDE6FE OIPPIQTVRM 4B0 

KENAEVGTTS MGYKAYDPET RSSSGIRYKK LTDPTGWVTI DENTGSIKVF RSLOREAETI S40 

KHGIYNITVL ASDQGGRTCT GTLGIILfflV NDHSPFIPKK TVIICKPTMS SAEIVAVDPD 600 

BPIHGPPFDF SLESSTSEVQ R»IWRI.XAIND TAAHLSYQND PPFGSYWPI TVBDRL04SS 660 

VTSIiDVTLCD CITQIDCTHR VDPRIQGGGV QLGKWAILAI LLGIALLFCI LFTLVCGASG 720 

TSKQPKVIPD DLAQQNLIVS MTEAPGDDKV YSAHGFTTQT VGASAQGVCO TVGSGIKMGG 780 

QETIEMVKQG HQTSESCRGA GHHBTIiOSCR GGHTEVDIICR YTYSEWRSFT QPRI.6EESIR 840 
QHTIiIKN 

Seq ID NO: 620 DNA sequence

Nucleic Acid Accession #: NM_032545.1

Coding sequence: 46..718

1 11 21 31 41 51

| | | | | |

AAACTGATCT TCAATGCACT AAGAGAAGGA GACTCTCAAA CCAAAAATGA CCTGGAGGCA 60 

CCATGTCAGQ CTTCTQTTTA CGGTCAGTTT GQCATTACAa ATCATCAATT TGOGAAACAG 120 

CTATCAAAOA GASAAACATA A0Q6C6GTA0 AGAGOAAGTC ACCAAGQTTG OCACTCASAA 180 
OCAOCOACAO TCACOGCTCA ACTGGACCTC CAGTCATTTC GGAQAG6TQA C 



CGOGCGG(XG CGCTGCTGCA GGAACGGOOG TACTTGCXSTG CTGGGCAGCT TCTGOGIGTG 
CKCGGCCCAC TTCACCJOOCC OCTACTGOGA GCATGACCAG AGGOGC3W3TG AATGCGGOGC 
CCTGOAGCAC GGAGCCTGGA CCCTCCGCGC CTGCCACCTC TGCAGOTGCA TCTTCGGOGC 
CCTGCACTGC CTCCCCCTCC AGACGCCTGA COGCTGTGAC CCGAAAGACT TCCTGGCCTC 
CCACGCTCAC GGGCCGAGCG CCGGGGGOGC GCCCAQCCTQ CTACTCTTGC TGCKCTGCGC 
ACTCX:TGCaC CGCCTCCTGC GCCCGGATGC GCCCGCGCAC CCTCGGXCCC TGQTCCCTTC 

cxmxrrccAS cxsgqagoggc gcccctgosg aasgocgbga cttgggcatc gcctttaatt 

TTCTATGTTG TAAATAATAG ATGTCTrTTAO TTTAOOGTAA GCTOAAeCAC TGOOIGAATA 
TTTTTATTOa GTAATAAATA TTTTCATOAA AGOQCCAAAA AAAAAAAAAA AAAAAAAAAA 




1 11 21 31 41 51

| | | | | |

MTHRBBVRLL FTVSLALQII NI.GHSYQREK BNGGREEVTK VATQKHRQSP mW TaSH FGB 
VTCSAEGROP EEPLPYSRAF GBGASARPRC CRNGCTCVUS SFCVCPAHFT GRYCEHDQRR 
SECGALBRGA WTLRAaOiCR CIFGAIBCLP LQTPDRCDPK DFUVSBAH6F SAGGAPSbLL
UiPCALLBRIi LRPOAPAHPR SLVPSVI.QRG RRPCGRPGIiO HRL 

Seq ID NO: 633 DNA sequence
Nucleic Acid Accession (' — 
Coding sequence: 1..390

1 11 21 31 41 51 

| | | | | |

ATGAGGTTCA GTGTCTCAGG CATGAGGACC GACTACCCCA GGAGTGTGCT GGCTCCTGCT 

TATOTCTCAQ TCTOTCTCCT CCTCrTOTGr CCAAGGGAAG TCATCGCTCC CGCTGGCTCA
GAACOITGGC TGTGCCAaOC GGCACCCAiaa TOTOOAOACA AGAICCACAA CCCCTTSQAG 
CACrraCTBTT ACAATOAOQC CATCGTGTCC CTQAOCSAGA CCCGCCA ATO TGGTCCCCCC 
TQCACCTTCT GGCCCTQCTT TGAGCTCTGC TQTCTTQATT CCTTT6GCCT CACAAACQAT 
TTTGTTGIGA AGCTGAAOGI TCAGGOTGTa AATTCCCAQT GCCACTCATC TCCCATCTCC 

AQTAAATQTG AAAGAGGCOS. GAXATQTTAG



| | | | | |

MKFSVSCaaiT DYPRSVI<APA YVSVCIiIiIiLC PRBVIAPAOS EPMLCQPAPR GGDtUYHPIf 
QCCYNDAIVS LSETRQCGPP CTFWPCFELC CLDSFGLTUD FWKLKVQGV NSQCKSSPIS 
SKCERGRIC 

Seq ID NO: 624 DNA sequence 
Nucleic Acid Accession #: M18728.1
Coding sequence: 51..1085

1 11 21 31 41 51

| | | | | |

GGAQCTCAAG CTCCTCTACA AAGAGGTGGA CAQAOAAGAC AGCA6AGACC ATGGGACCCC 
CCTCAGCCCC TCCCTGCAfiA TTGCATQTCC CCTQQAAGGA GGTCCTGCTC ACAGCCTCAC 
TTCTAACCTT CTGGAACCCA CCCACCACTG CCAAGCTCAC TATT6AATCC AOGCCATTCA 

ATGTOGCAGA GGGGAAGGAG GTTCTTCTAC TCQCCCACAA CCTGCCCCAQ AATCGTATTG
GTTACAGCTG GTACAAAGGC GAAAGAGTGG ATGGCAACAG TCTAATTQTA GGATATGTAA 
TAGGAACrCA ACAA6CTACC CCAGGGCCCG CATACAGTGG TCGAGAGACA ATATACCCCA 
AXGCATCCCT GCTGATCCAG AACQTCACCC AOAATOACAC AGGATTCTAT ACCCTACAAG 
TCATAAAGTC AGATCTTOTG AATOAAGAAG CAACCGGACA OTTCCATOTA TACCCGGAGC 

TGCCCAAGCC CTCCATCTCC AQCAACAACT CCAACCCOST QGAGSACAAO GATGCTGTGG
CCTTCACCTG TGAACCTGAG GTTCAGAACA CAACCTACCT OTGOIGGGTA AATGGICAGA 
GOrrOCCGGT CAGTCCCAG6 CI6CAGCI6T CCAATGGCAA CATOACCCTC ACTCTACTCA 
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GCX3TCAAAAG GAACX3ATGCA 
ACCGCAGTGA CCCAGTCACS: 
CCTCAAAGGC CAATTACCGT 
ACCCACCTGC ACAGTACTCT 
TCTTTATCXrC CAACRTCACT 
CAGCCACT6G CCTCAATAGG 
TOCICICACC TQTQGOCACC 
TATAGCAGCC CTGOTGTATT 
OAATTCTTCT AOCTCCTCCA 
CTCTGCTCXrr GAAGCCCTAT 
ACCCTCftGGC CTGAGGTGTG 
GCAAACCATG GTGAGAAATT 
AACAA GACT C CTCATCATGA 
TGCCTCTTTC GCTTGGCAGO 
GGGTAACTTA ACAiQAGTQTC 
AOATCCTTTA GTGCACCCAG 



GGATCCTATG 
CTGAATGTCC 
CCAGGGGAAA 
TGGTTTATCA 
GTGAATAATA 
ACCACAGTCA 



AATGTGAAAT 
TCTATGGCCC 
ATCTGAACCT 



ACAGAACXXA 
AGATGTCCCC 
CTCCTGCCAC 



GCGGATCCTA 1 



TTTAATTCAA 
GGAGQAGTCT 
GCTGAOACTA 
CTQACTCATT 
CTCTTGGTAT 
CTCTAAAAGC 
GQCIGQAATT 



CCCAGCCATG 
GTGCAGTTTG 
AOTTGTAGAA 
CTTTATTCTA 
TACCCTCCTA 
TTTAAATGTC 
ACAAAACTCA 



GACGACTTCA 
TAAGGCTCTT 
ATQAT6CT0T 
AGATCTATCr 
TGACTGACAT 
CAGAGTTGGA 
CAATGCCAAA 
TGACACTTGT 
ATTAACaUVAT 
TTTTAGTTGG 
ATAGTCATAC 



TGGACAACTC 
AGACTTCACC 
CaCTATGGAC 
ACCCCCTTTT 
CATTAGTATT 



ACCACTAAAA 
AATGAAAATT 
TAACTAGAGA 
AGCTTTTCCC 
AATTTGTCCT 
TCACAAGAAG 
AACQTTTTAC 



GCGAGTGCCA 
ACCATTTCCC 
GCWGCCTCTA 
ACACAAGAGC 
GCCCATAACT 
AGTGCTCCTG 
OTOGCTCTSA 
ACCAGACCCT 
ACAAGGTCTQ 
TAAAGGGAAA 
CAGTCAAACT 
AAGATGTCAA 
TGCTTATGCC 
TAGCTTCAGA 
ATAAAATAAO 
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CACCTGTTCT 
GCKXCTACC 
GGCTAAATAC 
GTTAAAATGG 
GCCTAAGGTG 
ACTCCCTGGT 
ATASTGAATG 
AACATCATAA 
GCTTTAAGAT 

3 0TGA60GCAT TGAGCCAQTG GrTGCTAAATG CTACATACTC 
CAATTTAAAA 
AAGTCCCCTC 
TGTATTTATT 
ATTGTATTGC 



TGTTGAACAT 
GTGCTGCTTG 
TTTGTATCTT 
TAGTAGTCAT 
CAGCCATCAA 
TCATCAGGAG 



CACTCCCTGT 
AGCTOAACAG 
AATGGGTATC 
CTACACTCAT 
CGTAGTCCAA 



TTTACAAAAA AGTAACCTGA 
TGTTTCCTTG TTCCAATTTG 
CTATCACTGT ACTTGTAGAa 
TAGCTCTATA ACT 



CAAT TAAAAA 
CTTGAOTTA6 
ACTAATCTGA 



CATAA1ACAS 
TGTTAACCAA 
CTGTTCTTGT 
TTAATTCATA 



TTGGTCACAC 
CAACIOAAAT 
AAAAAAAAGA 
TACTTT AACT 
TCTQTQGTTC 



AAAAGCCAAT 



1020 

loao 

1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 



1920 
1980 
2040 
2100 
2160 
2220 
22B0 
2340 
2400 
2460 
2S20 






1 11 21 31 41 51

| | | | | |
MGPPSAPPCR LHVPWKEVLL TASLLTFMNP PTTAKLTIES TPFNVAEGKE VliLLAHNLPQ 
HRIGYSWYKG ERVDGNSI.IV GYVIGTQQAT PGPAIfSGRET lYPNASIiLIQ NVTQNDTGPY 
TIKJVIKSDI.V HEBATGQFHV YPELPKPSIS SNNSNPVEDK DAVAPTCEPE VONTTyr.WWV 
MGQSIiPVSPR LQLSNiXIMTI. TIjLSVKRNDA GSYBCEIQAP ASAKRSDPVT UIVLYGPDVP 
TISPSRAHYR PGENLHLSCB AASH9PAQYS WFIMSTFQQS TQBIiPIPMIT VMMSGSyMOQ 
AHHSATCUIR TTVTMITVSG SAPVI.SAVAT VGITIGfVIiAR VALI 

Seq ID NO: 626 DNA sequence
Nucleic Acid Accession #: M18728.1
Coding sequence: 1355..1657



GGAGCTCAAG 
CCTCAGCCCC 
TTCTAACCTT 
ATOTCGCAGA 
GTTACAGCTG 
TAGGAACTCA 
ATGCATCCCT 
TCATAAAGTC 
TGCCCAAGCC 



GCCrCCCGGT 
GCGTCAAAAG 
ACCGCAGTGA 
CCTCAAAGGC 
A CCCAC CTGC 
TCTTTATCCC 



TCCTCrCAGC 1 



TQCCTCTTTC 
GGGTAACTTA 
AGATCCTTTA 
AAAT6TACM3 
TTTAATTCAft 



11 
1 

CTCCTCTACA 
TCCCTGCAGA 
CTGGAACCCA 
GGGGAAGGAG 
GTACRAAGGC 
ACAAGCTACC 
GCTGATCCAG 
AGATCTTGTG 
CTCCATCTCC 
TGAACCTGAG 
CAGTCCCAGG 
GAACGATGCA 
CCCAGTCACC 
CAATTACCGT 
ACAGTACTCT 
CAACATCACT 
CCTCAATAGG 
TGTGGCCACC 
CTGGTGTATT 
AGCTCCTCCA 
GAAGCCCTAT 
CTGAGGTGTG 
GTGAGAAATT 
CTCATCATGA 
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41 



I I 
AAGAGGTGGA CAGAGAAGAC AGCAGAGACC 
TTGCATGTCC CCTGGAAGGA GGTCCTGCTC 
CCCACCACTG CCAAGCTCAC TATTGAATCC 
GTTCTTCTAC TOGCCGACAA CCTGCCCCAG 
GAAAQAGTGG ATGOCAACAG TCTAATTGTA 



51 



AAGGICACCC 
AATGAA6AAG 
A6CAACAACT 
GTTCAGAACA 
CTGCAGCTGT 
GGATCCTATG 



I 

ATGGGACCCC 
ACAGCCTCAC 
ACQCCATTCA 
AATOGTATTG 
GGATATGTAA 
ATATACOCCA 



CCAGGGGAAA 



CCAACCCCQT 
CAACCTACCT 
CCAATGGCAA 
AATGTGAAAT 
TCTATGGCCC 
ATCTGAACCT 



GTGGTGGGTA 
CATQACCCTC 
ACAGAACCCA 
AGATGTCCCC 
CTCCTGCCAC 
CCAOCAATCC 



TTCQATATTT CAOGAAOACT GGCAGATTGG 
ATCCCATTTT ATCCCATGGA ACCACTAAAA 
ATGCTGGAGA TGG.ACAACTC AAT(3AAAATT 
TGCCACTCAG AGACTTCACC TA ACTAG AGA 
GACQACTTCA CACTATGGAC AGCTTTTCCC 



GATGCTGTGG 
AATGQTCAGA 
ACTCTACrCA 
GCGAGTGCCA 
ACCATTTCCC 



ATGATGCTGT C 



GTGCAGTTTC 
AGTTGTAGAA 
CTTTATTCTA 
TACCCTCCl 



TGACTGACAT 
CAGAGTTGGA 
CAATGCCAAA 
TGACACTTGT 
ATTAACAAAT 
TTTTAOTTOO 
ATAQTCATAC 



TGTCAATCCC 
TAGCAGCATC 
CTTCTAGACT 
TAATA<»ATT 
TGTTGAACAT 
GTGCTG CTTG 
TTTGTATCTT 
TAGTAGTCAT 



AACGTTTTAC 
TTTAACACAG 
CACCTGTTCT 
GCTCCCTACC 
GGCTAAATAC 
GTTAAAATGG 
GCCTAAGGTG 
ACTCCCT^JT 



GCCCATAACT 
AGTGCTCCTG 
GTGGCTCTGA 
ACCAGACCCT 
ACAAGGTCTQ 
TAAAGGGAAA 
CAGTCAAACT 
AAGATGTCAA 
TGCTTATGCC 
TAGCTTCAGA 
ATAAAATAAQ 
COGTGTGTTC 
CACTCCCTGT 
AGCTOAACAG 
AATGGGTATC 
CTACACTCAT 
CGTAGTCCAA 



: TTTAAATGTC TGCAXGCAGC CAGCCATCftA ATAGTGAATQ GTCrCTCTTT 
r ACAAAACTCA GAGAAATGIO TCATCAGGAG AACATCATAA CCCATGAAGO 
: CAAATGGTGG TAACTGATAA TAGCACTAAT GCTTTAAGAT TTGGTCACAC 



1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 

1800 
1860 
1920 
1980 
2040 
2100 
2160 
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TCTCACCTAQ OTGAGOSCAT TGAGCCAQTG GTQCTAAATC5 CTACATACTC CAACTGAAAT 
GTTAAGGAAG AAGATAGATC CAATTAAAAA AAATTAAAAC CAATTTAAAA AAAAAAAAGA 
ACACAGGAGA TTCCaGTCTA CTTGAGTTAG CATAATACAG AAGTCCCCTC TACTTTAACT 
TTTACAAAAA AGTAACCTGA ACTAATCTGA TGTTAACCAA TGTATTTATT TCTGTGGTTC 
TGTTTCCTTG TTCCAATTTO ACAAAACCCA CTGTTCTTGT ATTGTATTGC CCAGGGGGAG 
CTATCACTGT ACTTGTAGAG TGGTGCTGCT TTAATTCATA AATCAOVAAT AAAAGCXAAT 
TAGCTCTATA ACT 

Seq ID NO.- 



2400 
2460 
2S20 



PCT/US02/12476 






| | | | | |

MDSFSQDVKT RLl,IMIRI.t.P PPNI^LLMPA SPAHQDDAVI SISQEVASEG lll.TECQnfI.V 
NPNViHKIRD PLVHPVTDIS SIEWTAVCSM VQMSFSEUJF 

Seq ID NO: 628 DNA sequence
Nucleic Acid Accession #: M18728.1
Coding sequence: 2370..2S0X






GQAGCTCAAG 
CCTCAfiCCCC 
TTCTAACCTT 
ATGTCGCAGA 
GTTACAGCTG 
TAGGAACTCA 
ATGCATCCCT 
TCATAAAGTC 
TGCCCAAGCC 
CCTTCAOCTG 



CTCCTCTACA 
TCCCTQCAJJA 
CTGGAACCCA 
GGGGAAGGA6 
GTACAAAGGC 
ACAAGCTACC 
GCTGATCCAG 



AAGAGGTGGA CAOAGAAGAC AOCAG»QACC ATGGGACCCC 
TTGCATGTCC 
CCCACCACTG 
GTTCTTCTAC 
GAAAGAOTGG 
CCAGGGCCCG 
AACGTCACCC 



AGCAACAACT 
GTTCA6AACA 
CT6CAGCTGT 



ACOQCAGTGA 
CCTCAAAGGC 
ACCCACCTGC 
TCTTTATCCC 
CAGCCACTGa 
TCCTCTCAGC 
TATAGCAGCC 
GAATTCTTCT 
CTCTGCTCCT 
ACCCICAGGC 
GCAAACCAT6 
AACAAGACTC 



AGATCCTTTA 
AAATGTACA6 
TTTAATTCAA 



CAATTACCOT 
ACAGTACTCT 
CAACATCACT 



TGTGGCCACC 
CTGQTGTATT 
AOCrCCTCCA 



CCAGGGOAAA 
TGGTTTATCA 
GTGAATAATA 
ACCACAGTCR 
GTCGGCATCA 
TTCGATATTT 
ATCCCATTTT 



ATGGCAACA6 
CATACAGTGG 
AGAATGACAC 
CAACCGGACA 
CCAACCCCGT 
CAACCTACCT 
CCAATGOCAA 
AATQTOAAAT 
TCTATGGOCC 
ATCTOAACCT 
ATGGQAOOTT 
GCGGATCCTA 
CGATGATCRC 
CGA-rrGQAGT 
CAGGAAGACT 



TATTO AATCC 
OCTGCCCCAO 
TCTAATTGTA 



GTTCCATGTA 
GQAGGACAAG 
GTGGTCGGTA 
CATGACCCTC 
ACAGAACCCA 



AOGCCATTCA 
AATOGTATTO 
GGATATGTAA 
ATATACCCCA 
ACCCTACAAQ 
TACCCGGAGC 
GATGCTGTGG 
AATGGTCAGA 
ACTCTACTCA 



CTCCTGCCAC 
CCAGCAATCC 
TATGTGCCAA 
AGTCTCTGGA 
GCTQGCCAGQ 



ACCATTTCCC 
GCAGCCTCTA 
ACACAAGAQC 
GCCCATAACT 
A«3TGCTCCTG 
GTGGCTCTGA 
ACCRGACCCT 
ACAAGGTCTQ 



CTGAGGTSTG 
GTGAGAAATT 
CTCATCATGA 
GCTTGGCAGG 
ACAGRGTGTC 
BTGCA CCCAG 
TGGTCCTTTT 
CCCAGCCSkTG 



GACGACTTCA 



ATGATGCTGT 
AQATCTATCT 
TGACTGACAT 
CAGAGTTQGA 



AGACTTCACC 
CACTATGQAG 
ACCCCCTTTT 



AGCTTTTCCC 



CTGACTCATT 
CTCTTGGTAT 
CTCTAAAAGC 
GGCTGGAATT 
ATAAAA6CCC 
TCTCACCTAG 
GTTAAGGAAG 
ACACAGGAGA 
TTTACAAAAA 



ASTTQTAOAA 
CTTTATTCTA 
TACCCTCCTA 
TTTAAATGTC 
ACAAAACTCA 
CAAATQGTGa 
GTGAGCGCAT 
AAGATAGATC 
TTCCAGTCTA 
AQTAACCTSA 
TTCCAATTTO 
ACTTOTAGAO 
ACT 



TTTTASTTGG 



GAGAAATGTG 
TAACTGATAA 
TGAGCCAGTG 
CAATTAAAAA 
CTTQAGTTAG 
ACTAATCTGA 
ACAAAACCCA 



TGTCAATCCC 
TAGCA6CATC 
CTTCTAOACT 
TAATAGAATT 
TGTTGAACAT 
GTGCTGCTTG 
TTTGTATCTT 
TAGTAGTCAT 
CAGCCATCAA 
TCATCAGGAG 
TAGCACTAAT 
GTGCTAAATG 
AAATTAAAAC 
CATAATACAG 
TGTTAACCAA 
CTGTTCTTGT 
TTAATTCATA 



TCACAAGAAG 
AACGTTTTAC 
TTTAACACA6 



CAOTCAAACT 
AAGATGTCAA 
TGCTTATGCC 



GGCTAAATAC 
GTTAAAATGG 
GCCTAAGGTG 
ACTCCCTGGT 
ATAGTQAATG 
AACATCATAA 



CTACATACTC 



AGCTGAACAG 
AATGGGTATC 
CTACACTCAT 
CGTAGTCCAA 
GTAGTGTATT 
GTCTCTCTTT 
CCCATGAAGG 
TTGGTCACAC 
CAACTGAAAT 
AAAAAAAAGA 
TACTTTAACT 
TCTGTGGTTC 



AATCACAAAT A 



1140 
1200 
1260 
1320 
1380 

1500 
1S60 
1620 
1680 

1800 
1860 
1920 
1980 
2040 
2100 
2160 

2280 
2340 
2400 
2460 
2520 






Seq ID NO: 630 DNA sequence
Nucleic Acid Accession #: NM_016639.1
Coding sequence: 40..429

1 11 21 31 41 51

| | | | | |

GCGGCGGGCG CAGACAGCGG CGGGCGCAGG ACGTGCACTA TGGCTCGGGG CTOGCTGCGC 
OGOTTGCTGC GGCTCC?rOGT GCTGGGGCTC TGGCTGGOST TGCTGCGCTC CGTGGCCGGG 
GAGCAAQCQC CAGGCACOGC CCCCTGCTCC CXSOGGCAOCT CCTGGAQCCX: GGAC CTBQAC 
AAGTGCATGG ACTGCGCGTC TTGCAGGGOQ CGACO BCftCft. GOSACTTCIG CCTQOaCTGC 
GCTGCAGCAC CTCCTGCCCC CTTCCGGCIG CTTTGGCCCA TCCTTGGGGG CGCTCTGRGC 
CTOACCTTCG TGCTGGGGCT GCTTTCTGOC TTTTTGGTCT GQAGAOJATG COQCAGGAGA 
GAGAAQTTCA CCACCCCCAT AGAGGAGACC GGCGQAGAOQ GCIGCCCAGC TGTGQCGCTG 



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ATCCAOTGAC AATGTGCCCC CTGCCM3C0S GGGCTOGCXC ACTCATCATT CATTCATCCA 480 

TTCTAGAGCC AGTCTCTGCC TCCCAOACGC GGCGGGAGCC AAGCTCCTCC AACCACAAGG 540 

GGGGTGGGGG GaSGTOAATC AtXTTCTCAGG CCTGC3GCCCA GGGTTCAGGG GAACCTTCCA 600 

AGGTGTCTGG TTGCCCTGCC TCTGGCTCCA GAACAGAAAQ GGAGCCTCAC GCTcaCTCAC 660 

ACAAAACAGC TGACACXQAC TAAG6AACTQ CAGCATTTGC ACAGGGGAGG GGGGTGCCCT 720 

CCTTCCTTAa OACXTraGOGO CCAG6CTGAC TTQGGGGGCA GACTTQACAC TAGGCCCCAC 780 

TCACTCAGAT eTCCIGAAAT TCCACCAGQQ QGGTCACCCT GGGOGGTrAG OQACCTATTT S40 

TTAACACTAa GGQCIGGCCC ACTAGOAGGG CTGOCCCTAA SATACASACC CCCCOVACTC 900 

CCCAAAGOGG G6AQGAQATA TTTATTTTGQ GGAOAGTTia GAQ6G6AGGG AGAATTTATT 960 
AATAAAAGAA TCTTTAACTT TAAAAAAAAA AAAAAAAA 

Seq ID NO: 631 Protein sequence
Protein Accession #: NP_057723.1

1 11 21 31 41 51

| | | | | |

MARGSIiRRUi RIiLVUSLHIA IjLRSVAGBQA PGTAPCSROS SWSADU)KCM DCASCRARPH 60 

SDPCLGCAAA PPAPPRM-WP ILGGAIiSIiTF VLQLLSOFLV HRKCRRHBKF TTPIEETSGB 120 
GCPAVALIQ 

Seq ID NO: 632 DNA sequence

Nucleic Acid Accession #: NM_003816.1

Coding sequence: 79..2538

1 11 21 31 41 51

| | | | | |

CGGCAGG6TT GGAAAATQAT GOAAGAGGCXS GAG6TGGAGG CGACX:GAGTG CTGAGAGGAA 60 

CCTtSOGGAAT COGCCGAQAT aOGQTCTGGC GCX3CQCTTTC CCTCGGGGAC CCTTCGTGTC 120 

CGGTGGTTGC TGTTGCTTGO CCTOGTGGGC CCAGTCCTCO GTGCGGCGCG GCCAGGCTTT 180 

CAACRGACCT CACATCTTTC TTCTTATGAA ATTATAACTC CTTGGAGATT AACTAGAGAA 240 

AGAAGAGAAG CCCCTAGGCX: CTATTCAAAA CAAQTATCTT ATOTTATTCA GGCTGAAGGA 300 

AAAGAGCATA TTATTCACTT GGAAAGQAAC AAAGACCTTT TGCCTGAAGA TnTGTGGTT 360 

TATACTTACA ACAAGOAAOQ GACTTTAATC ACTGACCATC CCAATATACA GAATCATTGT 420 

CATTATOGOQ GCIATCTGGA GGGAGTTCAT AATTCATCCA TTaCTCrTAa CQAClGTrTT 480 

GGACTCAGAG GATTGCTGCA TTTAGAGAAT GCQASTTATG GGATTOAACC CCTGCAGAAC 540 

AGCTCTCATT TTGAGCAC3VT CATTTATCGA ATGGATGATG TCTACAAAGA GCCTCTGAAA 600 

TGXGGAGTTT CCAACRAGGA TATAGAGAAA GAAACTGC7UV AGGAT6AAGA GGAAGAGCCT 660 

CCCAGCATOA CTCAGCTACT TCGAAGAAGA AGAGCTGTCT TGCCACAGAC CCGGTATGTG 720 

GAGCTGTrCA TTGTCGTAQA CAAGGAAAGG TATOACATGA TGGGAAGAAA TCAGACTGCT 790 

GTGAGAGAAG AGATQATTCT CCTGGC3UVAC TACTTGGATA GTATGTATAT TATGTTAAAT 840 

ATTOGAATTG TGCTAGTTGG ACTGGAGATT TOSACCRATG GAAACCTGAT CRACATAGTT 900 

GGGGCrCGCTG GrGATGTGCT GGGGAACTTC GTOCASTGGC GGOAAAAGTT TCTTATCACA 960 

CQTCaOAGAC ATGACAG3GC ACAGCTAeTT CTAAAGAAAG aTTTTGSTGO AACTGCAOOA 1020 

ATOOCATTTG TaGaAACAOT GTGTTCAAS6 AGCCACGCAa GCGGGATTAA TGTGTTTGGA 1080 

CAAATCACTG TOGAGACATT TGCTTCCaVTT GTTGCTCATQ AATTGGGTCA TAATCTTGGA 1140 

ATGAATCAOQ ATGATGGGAG AGATTGTTCC TGTGGAGCaUV AGAGCTGCAT CATGAATTCA 1200 

GGAGCATOSG GTTCCAQAAA CTTTAGCAGT TGCAGTGCAG AGGACTTTGA GAAGTTAACT 1260 

TTAAATAAAG OAGGAAACTG CCTTCTTAAT ATTCCAAAGC CTGATGAAGC CTATAGTGCT 1320 

CCCTCCTGTa GTAATAAGTT GGTGGAOSCT GGGGAAGAGT GTGACTGTGG TACTCCAAAG 13 BO 

GAATGTGAAT TGGACCCTTG CTGCGAAGGA AGTACCTGTA AGCTTAAATC ATTTGCTGAG 1440 

TGTGCATATG GTGACTGTTG TAAAGACT6T tXSGTTCCTTC CAGGAGGTAC TTTATGCCGA ISOO 

GOAAAAACCA GTGAGTGTGA TGTTCCAGAG TACTGCAATG GTTCTTCTCA GTTCTGTCAG 1560 

CCAGATGTTT TTATTCaGAA TGGATATCCT TGCX»GAATA ACAAAGCCTA TTGCTACAAC 1620 

GGCATQTGCC AGTATTATGA TGCTCAATGT CAAGTCATCT TTGGCTCAAA AGCCAAGGCT 1680 

GCCCCCAAAG ATTGTTTCAT TGAAGTCAAT TCTAAAGGTG ACAGATTTGG CAATTOrGGT 1740 

TTCTCTOGCA ATGAATACAA GAAGTGTGCC ACTSGGAATG CYnO ' J tf J O a AAASCTTCAQ 1800 

TGTOAGAATQ TACaVAGAGAT ACCTGTATTT GGAATTGTGC CTGCTATTAT TCaAAOGCCT 1860 

AQTOQAGaCA CCAAATGTTG GGGTGTGOAT TTCCAGCTAG GATCAGATGT XCCAOATCCT 1920 

GGGATGGTTA ACXSAAGGCaC AAAATGTGGT GCTCGAAAGA TCTGTAGAAA CTTCCAGTGT 1980 

GTAOATGCTT CTGTTCTGAA TTATQACTGT GATGTTCAGA AAAAGTGTCA TGGACATGGG 2040 

GTATGTAATA GCAATAAGAA TTGTCACTQT GAAAATGGCT GGGCTCCCCC AAATTGTGAG 2100 

ACTAAAGGAT ACXX3AGGAAG TGTGGACAGT GGACCTACAT ACAATGAAAT GAATACTGCA 2160 

TTGAQGGACQ OACTTCTGGT CTTCTTCTTC CTAATTGTTC CCCTTATTGT CTGTGCTATT 2220 

TTTATCTTCA TCAAGAGGGA TCAACTGTGG AGAAGCTACT TCAGAAAGAA GAGATCACRA 2280 

ACATA:&GAGT CAGATGGCAA AAATCAAGCA AACCCTTCTA GACAGCCGGG GAGTGTTCCT 2340 

CQACATGTTT CTCXAGTOAC ACCTCCCAGA GAAGTTCCTA TATATQCAAA CaOATTXGCA 2400 

GTACCAACCT ATGCAGCCAA GCAACCTCAG CAGTTCXXXT CSAGGCCACC TCCACCACAA 2460 

CCGAAAGTAT CATCTCAGGG AAACTTAATT CCTGCXXX3TC CTGCTCCTGC ACCTOCTTTA 2520 

TATAGTTCCC TCACTTGATT TTTTTAACCT TCTTTTTGCA AATGTCTTCA GGGAACTGAG 2580 

CTAATACTTT TTTTTTTTCT TCSATGTTTTC TTGAAAAGCC TTTCIOTTGC AACTATGAAT 2640 

GAAAACAAAA CACCACAAAA CAGACTTCAC TAACACAGAA AAACAGAAAC TQAGTGTGAG 2700 

ASTTGTGAAA TACAA6QAAA T6CAOTAAAG CCAGGGAATT TACAATAACA TTTCCGTTTC 2760 

CATCATTOAA TAAGTCTTAT TCAGTCATOG GTGAGQTTAA TGCACTAATC ATGGATTTTT 2820 

TGAACATOTT ATTOCAGTGA TTCTCAAATT AACTGTATTG GIGTAAOATT TTTGTCRTTA 2880 

AGTQTTTAAG TGTTATTCTG AATTTTCTAC CTTAOTTATC ATTAATGrAQ TTCCTCATTG 2940 

AACATOTQAT AATCTAATAC CIQTGAAAAC TGACTAATCA GCTOCCAATA ATATCTAATA 3000 

TTTTTCATC» TGCACGAATT AATAATCATC ATACTCTAGA ATCTTGTCXO TCACTCACTA 3060 

CATGAATAAG CAAATATTGT CTTCAAAAGA ATGC»CAAQA ACCACAATTA AQATGTCATA 3120 

TTATTTTGAA AGTACAAAAT ATACTAAAAG AGTGTGrGTG TATTCACGCA GTTACTOGCT 3180 

TCCATTTTTA TGACCTTTCA ACTATAGGTA ATAACTCTTA GAGAAATTAA TTTAATATTA 3240 

GAATTTCrAT TATGAATCAT GTGAAAGCAT GACATTOGTT CACAATAGCA CTATTTTAAA 3300 

TAAATTATAA GCTTTAAGGT ACQAAGTATT TAATAGATCT AArCAAATAT GTTGATTCAT 3360 

GGCTATAATA AAGCAGGAGC AATTATAAAA TCTTCAATCA ATTGAACTTT TACAAAACCA 3420 

CTTQAGAATT TCAIOASCAC TTTAAAATCT GAACTTTCAA AGCTTGCTAT XAAATCATTT 3480 

AQAATdTTTA CATTTACTAA GGTOIGCTGG GICAIGTAAA ATATTAGACSV CTAATATTTT 3S40 

CATAGAAATT AGGCIGGA6A AAGAAGGAAO AAATtSGTTTT CTTAAATACC TACAAAAAAG 3600 

TCACTtnOGT AXCTATGAQT TATCATCTTA GCTGIOTTAA AAATGAATTT TTACTATGGC 3660 
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AGATATGGTA TaOATCQTAA AATTTTAAGC ACTAAAAATT TTTTCATAAC CTTTCRTAAT 3720 
AAA6TTTAAT AATAQGrTTA TTAACTGAAT TTCATTAGTT TTTTAAAAGT GTrTTTGGTT 37B0 
TGTGTATATA TACATATACR AATACAAC&T TTACAATAAA TAAAATACTT OAAATTCTCA 3840 
AAAAAAAAAA AAAAAAAAAA AAAAA 



1 11 21 31 41 51 

| | | | | |

MOSGARFPSG TLRVRWLLLL GLVGPVLGAA RPGPQQTSHL SSYEHTPWR LTRBRREAPR 
PYSKQVSYVI OAEGKEHIIH liERHKDLIjPE DEWYTYHKB QTLITDHPHI QNHCaiVRGYV 
EGVHNSSIAL SDCFGLRGLL HLEMASYGIE PLQNSSHFEK IIYRMDDVYK EPLKCQVSNK 
DIEKETAKDE EEEPPSMTQI. LRRRRAVLPQ TRYVELFIW DKERYDMMGR KQTAVREEMI 
LLANYUDSMY IMLNIRIVLV GIjEIWTNGNL IHIVGGAGDV I/ajFVQWREK FI<ITRRilHDS 
AQLVIJCKGFQ GTAEMAFVGT VCSRSHAGGl NVFGQITVET FASIVAHELG HNtiO«NHDDG 
SDCSCGAXSC IMNSGASGSR NFSSCSAEDF EKLTLHKGGM CLUIIPKPDE AYSAPSOOtrK 
LVDAGEECDC GTPKBCELDP CCEGSTCKLK SFAECAYGDC CKDCSPIiPGG TLGRGKTSEC 
DVPEYQJQSS QFCQPDVPIQ HGYPOINNKA YCYNGMCQYY DAQCQVIPGS KAKAAPKDCF 
lEVHSKGDRF GMCGFSGNEY KKCATGNALC GKLQCENVQE IPVFGIVPAI IQTPSR6TKC 
MOVDFQLGSD VPDPGMVNEG TKCGAGKICR KFQCVDASVI. NYDCDVQKKC HGHGVCaJSMK 
HCHCENQHAP PNCBTKGYGQ SVDSGPTYNB MCTTALRDGLL VFFFIiIVPI,! VCAIFIFIKR 
DQLWRSYFRK KRSQTYESDQ KNQANPSRQP GSVPRHVSPV TPPREVPIYA NRFAVPTYAA 
KQPQQPPSRP PPPQPKVSSQ GNLIPARPAP APPLYSSLT 

Seq ID NO: 634 DNA sequence

Nucleic Acid Accession #: NM_002091.1

Coding sequence: 56..503

1 11 21 31 41 51

| | | | | |

AGTCTCTGCT CTTCCCAGCC TCTCCGGCGC GCTCCAAGGG CTTCCCXjTOG GOACCaTGCXS 
CGGCAGTGAG CTOCOQCTGQ TCCIGCTGGC GCTGOTCCTC TGCCTAGCGC CCOGGGGGOG 
AGCGGTCCCG CTGCCTGOGG GCGGAGGGAC OGTGCTGACC AAGATGTACC OCCGCGQCAA 
CCACTGGGOG GTGaOBCKCrt TAAT6GGGAA AAAQAGCACA OGGGASTCTT CTTCTGTTTC 
TGAGAGAGGG AGCCTGAAOC AGCAGCTGAG AGAGTACATC AGGTGGGAAG AAQCTGCAAQ 
GAATTTGCTG GGTCTCATAG AAGC»AAGGA GAACAGAAAC CACCAGCCAC CTCRACCCAA 
GGCCTTGGGC AATCAGCAGC CTTCGTGGGA TTCAGAGGAT AGCAGCAACT TCAAAGATGT 
AGGTTCAAAA GGCAAAGTTG GTAGACTCTC TGCTCCAGGT TCTCAACGTG AAGGAAGGAA 
OCXXXaGCTQ AACCAGCAAT GATAATGATG GCCTCTCTCA AAAGAGAAAA ACAAAACCCC 
TAAOAGACTQ AaTTCTOCAA GCATCAQTTC TACGGATCAT CAACAAGATT TCCTTGTGCA 
AAATATTTGA CTATTCTGTA TCTTTCATCC TTGACTAAAT TCeTGATTTT C3«GC»GCAT 
(.■ ri - Ca jU TT T AAACTTGTTT CCIGTQAACA ATTGTCGAAA ROAGTCTTCC AATTAATGCT 
TTTTTATATC TAOQCTACCT GTTGGTTAGA TTCRAGGCCC CGAOCTGTTA CCATTCACAA 
IV AACACAT 



DVOSKQEWOR LSAFGSQRES RWPQLNQQ 

Seq ID NO: 636 DNA sequence

Nucleic Acid Accession #: HH_0ieS22.1

Coding sequence: 365..1299

1 11 21 31 41 51

| | | | | |

GCQQAftGCAO OQAQGAGGGA GCCCCCTTTG GCCGTCCTCC GTGGAACCX3G TTTTCOQAGG 60 

CFGQCAAAAa CCOAGQCTGO ATTTGGGGGA GGAATATTAG ACTCGGAGGA GTCTGCGCGC 120 

TTITCTCXrrC CCCXaoSCCTC CCGGTCGCOG CGGGTTCACC GCTCAGTCCC OGCX3CrCX3Cr 180 

CCQCACCCCA CCXaVCTTCCT GTGCTCGCCC GGGGGGCGTG TGCCGTQCGG CT GCOGG AOT 240 

TaSOGGAAGT TOTOGCrGTC GAGAATGGGG GTCTGTGGGT ACCTGTTCCT GCCCTGGAA6 300 

TGCXn'OGTGG TCGTGTCTCT CAGGCTGCTG TTCCTTGTAC CCACAGGAGT GCCOQTGCGC 360 

AGCGGAGATG CCACCTTCCC CRAAGCTATC GACAACGTGA CGGTCCGGCR GGGGGAGAGC 420 

GCCACCCrCA GGTGCACTAT TGACaVACCGG GTCACCCGGQ TGGCCTGGCT AAACCX3CAGC 480 

AOCATCCTCT ATOCTQGGaA TOACAAGTGG TGCXTTOGATC CTCGCGTGGT CCTTCTGAGC 540 

AACACCX»AA CXJCAOTACAa CATCGAGRTC CAQAACX?rGG ATGTGTATGA OGAGGGCXKT 600 

TACACCTGCT 06GTGCAGAC AGACAACCAC CCAAAGACCT CTAGGOTCCA CCTCATTQTG 660 

CAAGTATCTC CCAAAATTGT AGAGATTTCT TCAGATATCT CCATTAATGA AGGGAACAAT 720 

ATTAGCCTCA CCTGCATAGC AACTGGTAGA CCAGAGCCTA CQGTTACTTG GAGACACATC 780 

TCTCCCAAAG CGGTTGGCTT TGTGAGTGAA GACX3AATACT TGGAAATTCA GGGCATCACC 840 

COGGAACaGT CAGGGGACTA CGAGTGCAGT GCCTCCAATO ACGTOGCGGC GCCOOTGGTA 900 

0GGAGA6TAA AGGTCACOGT GAACTATCCA CCATACATTT CAGAAGCX^AA QQGTACAGGT 960 

GTCCCCGTGG GACRAAAGGG GACACTGCAG TGTGAAGCCT CAGCAGTCCC CTCAGCAGAA 1020 

TTCCAGTGGT ACAAGGATGA CAAAAOACTG ATTGAAGGRA AGAAAGGGGT GAAAGTGGAA 1080 

AACAGACCTT TCCTCTCAAA ACTCATCTTC TTCAATGTCT CTGAACATGA CTATGGGAAC 1140 

TACACTTGCQ TGGCCTCCAA C»AGCTGQGC CACACCAATG CCAGCATCAT GCTATTTGGT 1200 

CCAGGCeCCG TCAGCGAGQT QAGCAACGGC ACGTCGAGOA GGGCAGOCIG OGTCTGaCTG 1260 

CTGCCTCTTC TGGTCTTGCA CXTTGCTTCTC AAATTTTGAT QTQAOTGCCA CTTCCCCaWS: 1320 

CGGGAAAGGC TCCCXSCCACC ACCACCACCA ACACAACAGC AATGGCAACA COOACAGCAA 1380 

CCaVATCAGAT ATATACAAAT OAAATTAGAA OAAACACMC CTCATGOOAC AGftAATTTCa 1440 

GGQAGGGQAA CAAAGAATAC TTTGGGGQGA. AAAOAGTTTT AAAAAAGAAA TTGAAAATTa 1500 

CCTTQCAGAT ATTTAG6TAC AATGQAGTTT TCTTTTCCX» AACGGOikAGA AC*C»aCRCA 1560 
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10 



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ccasGcrraa acccactgca agctgcatog tgcaacctct ttgqtgccaq lartaaacMG leao 

GGCTCACSCXrr CTCTGCXX»C AGACTGCCCX: CACGK3GAAC ATTCTGGAGC TGGCCATCXX: 1680 

AAATTCAATC AGTCCATAC3A GACGAACAGA ATGAGACCTT CCQGCCCAAG CGTGGGGCTT 1740 

CCGGCCCAAG CGTGGCGCTG CGGGCRCTTT GGTAGACTGT GCCRCCACGG CGTOTOTTQT 1800 
GAAAOGTGAA ATAAAAAGAG CAAAAAAAAA AAAAAAAAA 

Seq ID NO: 






1 11 21 31 41 51

| | | | | |
MGVaSYIiFLP MKCLWVSIiR LIiFIjVPTGVP VRSGDATFPK AMDMVTVRQG ESATUICTID 
NRVTRVAWUI RSTILYAGKD KWCLDPRVVL LSNTQTQYSI EIQNVDVYDE GPYTCSVQTD 
HHPKTSRVHL IVQVSPKIVE ISSDISINEQ NMISLTCIAT GRPBPTVTWH HISPKAVQFV 
SEDEYLBIOG ITREQSGDYS CSASHDVAAP WRRVKVTVN YPFYISBAKO TGVFVGQXGT 
LQCBHSAVFS AEFQNYXDOK RLIEGKKGVK VESRFFLSKL IFFKVSEHDY QIYTCVASinC 
LSITNASIMI. FSPGAVSEVS HQTSRHAGCV HU.PLLVUIL LLKF 

Seq ID NO: 638 DNA sequence

Nucleic Acid Accession #: NM_012261.1

Coding sequence: 203..1045






GATTTGCTCr GCCAOCASCT 
ACAGAATAGG OGCTCCCTCX: 
CACTCCAGCG GCGACTTTGA 
OCICAnOQG G6CACTGCGA 



OQAAAATCTC TC3W3GCCTTT 
TGOGACQACG TGTCTCATGG 
GGCCASCAAC TACGTAGATC 



CXKATATOCA CTCAAAATQC 



GGGATTCCCT 
GTATGGATCT 
TGTTCCATAC 
CCACTAACCC 
CAGAGTTTGC 
TGATCACAGA 
OCCACAqCCA 
TCTTTGTAAA 



CTCTGGCGGC 



ACAGGCCGAT 
GTCQGAGCTG 
GGAAAGCChC 



GCCTCTCGCT CRCCCCGGCX: 
CTCTGCAGCA GCACAGCCGG 
GGGGTCCCCA GCAT03ACAQ 
ATCATGGCAG AACAAGAAGT 
ATATTTGTGG TGCGGGAAAA 
ATTGTACCTT ATGATQTGTG 



CAAGTGT7CT GQOTGGATOG 540 
AACAT6TCCA 
TCCrOGGAGA A 



AGCTCAACAA ACCATTTCAC 



CAAAGACGCA GTCAGTGCTa GOAAGCACAC AGCCAACTCG CACCACCTCT CTGCCTTGGT 720 
CACCCCCGCT GGGAAOTCCT ATGAGTGTCA 
TGATCCGCAG AAGACGGTCav CCATGATCCT 
TATCTCAGAT TTTGTCTTCA GTGAAGAGCA 
GGAAGAAACC TTGCCCCTGA TTTTGGGGCT 
OQCQATTTAC CACmCCACC ACAAAATGAC 
ATCCCAGTAT AAGCACATGa 



ATGCTGGGGA 
TGACTCTCCA 
rrQAAAACAT 
TGCTCCCTTG 
TCATGCTCCC 
CTTTAGTGAT 
AAAAOBACTA 



AAGAGCAATA 
GCTTCTTTGA 
6ACACAGCTG 
TGC31GCAAGA 



CCTGGGTATC 
TCTTTOGGAT 
AGGGTCTCAG 
AATGCCACTT 
GGAGGAAACC 
GCTTATCCTA 
CXXrCTGAAAG 
ATGTTTCACT 
GCAGAOTTGT 



GTCTGOGGTC 
TAAATGCCXa 
CATCTTGGGC 
TGCCAACCAG 
6TTAGGC3UX3 
CTTTTCCATC 
TOAOOCTTGC 
TTGTAGGGTG 
ACAGCTTTCG 
GOAOCTOTAT 
CCTTTAGGTT 
TAC3«5TTGTC 
TGATTCATGC 
GCTACCCGCA 
TTGGACTTCT 



TOGCCTCTAQ 
CTTTTGACAT 
GGGAGCAACT 
TGGTAACACT 
CTCGGGACAG 
TCCTGCTCCC 



AAATGGCAAT 
TGCraVTGGT 
CTGGCCCCAA 
CAGAAGAATA 
AATGCACACA 
TTCTGGCTGG 
TCCAGGGACT 
TCCTGTGCCA 
AAATGAAATA 



TCCATGCTTA 
TATTCTCTCC 
GGCTTGGCTT 
AGTTTAGGGA 
TGGGGTGCTT 
GAATACAACC 
CATTCTGCAT 
GCAGCACCAG 
G6TCCAAOTC 



TTCTCTGQC 



Protein Accession #: NP_036393.1



21 



41 



51



GTCTCTGCIG 1 



GTGGTA6CCT 
AAGAAAQTCA 
ACKATGCATC 
CAGTAAGAAT 
QAAaAOTGTQ 
CTAATATACT 



TTGQTAAACT 
CGCTGAA6AA 
TCa«3AAAAT 
ATAAAATTGC 
AAGAAGGAAG 



ACTCACTCTT CTCKTAAAAT 



GCAGGTGTTC 
CGGGAAGCAA 
TTTGGACAGT 
CCAGTCTTCA 
GGTTGGTTTT 
TAOGCTTCTC 
TTTACTOTTA 
AAAGAATCAC 
CTATCATACA 
TTTATTAGTQ 
AGGAAATATT 



CCOGCAGGCC 
GTTTGTCTGG 
GGAAACAAGA 
GCGG7U3CAGT 
TTTCCATTTT 
CCTQAAOTTT 
TTTTACCTQA 
TGGTTATTIU3 
TTCCTTAAAG 
TGCTGT TGftG 
TTAGTTCTGT 



CGCAGTGCTC 
ACCCGOAAGC 
AAAACTGAGT 
TTTCTGGAGA 
CTACATGGAT 



1020 
1080 
1140 
1200 

1320 
1380 
1440 
ISOO 

1620 
1680 
1740 



| | | | | |

MDIiOGKOVPS IQRIiRVliIML PHTOAOIMAB QEVEHLSOLS TKPEKDIFW RENGTTCLMA 
EFAAKFIVPY DVWASNYVDL ITEQADIALT R6AEVKGRCX3 HSQSEI.QVFW VDRAYALKMIj 
FVKESHNMSK GPEATWRLSK VQFVYDSSEK THFKDAVSAG KHTAHSHHLS ALVTPAGKSY 
ECXIAQQTISL ASSOPQKTVT MII.SAVKIQP FDIISOFVFS EEHKCPVDER BQtiEETIiPI.1 
LQliILGIiVIH VTLAIYHVKU KMTANQVQIP RDRSQYKHMG 

Seq ID NO: 640 DNA sequence

Nucleic Acid Accession #: NM_002993.1

Coding sequence: 64..408

1 11 21 31 41 51

| | | | | |

GGCAOGAGCC AGTCTCCGCG CCTCCACCCA GCTCAGGAAC CCGCGAACCC TCTCTTGACC 
ACTATGA6CC TCCOGTCCAQ CCGCGCGGCC CGTGTCCCGG GTCCTTCGGQ CTCCTtGTGC 
3 CGCT6CTGCT CCTGCTGAOS CCGCCGGQOC CCCTC6CCAO OGCTGGTCCT 



CA AGSIG BAA 
CCCTTTTCTA 
AACAAAAAAG 
TCCCTGGACC 
TCCCTACTTT 
TAAXGAAGTA 



TTTCTTGGQO 



AATATTGAAT 
AAGGCTGTGG 
TGTTGTTCTT 
AATATGTTAC 
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TCTTTACCCT AGGATGCTAT TTAAGTTGTA CTGTATTAGA ACACTGGGTG TGTCATACCG 
TTATCTGTGC AGAATATATT TCCTTATTCA GAATTTCTAA AAATTTAAGT TCTGTAAGGG 
CTAATATATT CTCTTCCTAT GGTTTTAGAT GTTTGATGTC TTCTTAGTAT GGCATAATGT 
CATGATTTAC TCATTAAACT TTGATTTTGT ATGCTATTTT TTCACTATAG GATGACTATA 
ATTCTGGTCA CTAAATATAC ACTTTAGATA GATGAAGAAG CCCAAAAACA GATAAATTCC 
TGATTGCTAA TTTACATAGA AATGTATTCT CTTGGTTTTT TAAATAAAAG CAAAATTAAC 
AATGATCTGT GCTCTGCAAA GTTTTGAAAA TATATTTGAA CAATTTGAAT ATAAATTCAT 
CATTTAGTCC TCAAAATATA TACAGCATTG CTAAGATTTT CAGATATCTA TTGTQGATCT 
TTTAAAGGTT TTGACCATTT TGTTATGAGG AATTATACAT GTATCACATT CACTATATTA 
AAATTGCACT TTTATTTTTT CCTGTGTGTC ATGTTGGTTT TTGOTACrTG TATTGTCATT 
TGGAQAAACA ATAAAAGATT TCTAAACCAA AAAAAAAAAA AAAAAAA 



| | | | | |

MSLPSSRAAR VFGPSGSLCA LUUiLIiLLTP FGFLASAGPV SAVI.TELRCT CUtVTLRVNP 
KTIGKIiQVFE> AGFQCSKVEV VASLKNGfCQV CI>DPBAPFLK KVIQKILDSO NKKN 

Seq ID NO: 642 DNA sequence

Nucleic Acid Accession #: NM_013271.1

Coding sequence: 27..809

1 11 21 31 41 51

| | | | | |

TCOSGAQCCA GGCTCGCTGG GGCAGCATGG CGGGGTOGCC GCTGCTCTGG GGGCOGCGGQ 
CCGGGGGC6T CGGCCTTTTG GTGCTGCTGC TGCTCGGCCT GTTTCGGCCG CCCCCOJCQC 
TCTGCGCGCG GCOSGTAAAG GAACCCCGCG GCCTAAGCGC AGCGTCTCCG CX:CTTGaCTG 
AGACTGQCGC TCCTCXSCXXSC TTCCS3GCGGT CAGTGCCCCG AGGTGAGGCX3 GCGGGGGCGG 
TGCAQGAGCT GGCGCGGGCG CTGGCGCATC TGCTGGAGGC CGAACGTCAG GAGCGGGCGC 
GGGCOSAGGC GCAGGAGGCT GAGGATCAGC AGGOGOGOST CCTQGCGCAG CTSCTGCGCG 
TCTGGGGCGC CCCCCGCAAC TCTQATCCGQ CTCTGGGCCT GOAOSACXBU: CCCQAOSCXX: 
CTGCAOCGCA QCTOOCTOGC GCTCTGCTCC QCXKXXGCCT TGACCCTOCC QCCCTA6CAQ 
CCCAGCTTQT CCCOGOGCCC GTCCCCX3CCG OGGOGCTCCQ ACCCXSGOCC OCGGTCTACn 
ACGACGGCCX CGCGGGCCCG GATGCTGAGG AGGCAGGCGA CGAGACAOX: GACGTGGACr 
CCGA6CTGTT GAGGTACTTG CTGGGACGGA TTCTTGCGGG AAGCGCGGAC TOCGAGGGGG 
TGGCAGCCCC GCGCCGCCTC CGCCGTGCCG CCGACCACGA TCTGGGCTCT GAGCTGCCCC 
CTGAGGGCGT GCTGGGGGCG CTGCTGCXTTO TGAAACGCCT AGAGACCCCQ GCGCCCCAGG 
TGCCTGCAOG CCGCCTCTTG CCACCCTGAG CACTGCCCGG ATCCXXSTGCa CXJCTGGGACC 
CAGAAOTGCC CCOSCCATCC CGCCACCAGG ACTTCTCCCC QCCAGCACGT CCAGAGCAAC 
TTACCCCGGC CAGCCAGCCC TCTCACCCGA GGATCCCTAC CCCCTGGCCC ACAATAACAT 
GATCTGAGC 



1 11 21 31 41 51

| | | | | |

riAGSPUiWGP RAGGVGLLVI, U.LQtFSPPP ALCASPVKEP SGLSAASPPL AETGAPRRFR 
3 AVQELASALA HLLEAERQER ABAEAQEAED QQARVLAQLL RVHGAPRNSD 
3 APAAQLAKAL UtARLDPAAL AAQLVPAPVP AAAIAPRPPV YDDGPAGPDA 

^ RIIAGSAOSE GVAAPRRUta AADHDVGSBIi PPBGVLOAUi 

RVKRIiETPAP QVPARRUiPP 

Seq ID NO: 644 DNA sequence
Nucleic Acid Accession #: NM_002214
Coding sequence: 681..2990

1 11 21 31 41 51

| | | | | |

CCCAGAGCCB CCTCCCCCTG TTGCTGGCAT CCCGAGCTTC CTCCCTTGCC AGCCAGGACO 
CTGCXaSACTT QTCTTTGCXX: GCTGCTCCGC AGACGGGGCT GCAAAGCTGC AACTAATGGT 
GTTGGCCTCC CTGCECACCT GTGGAAGCAA CTGOSCTGAT TGATOOGOCR CftGACTTTTT 
TCCCCTCGAC CTCGCCGGCG TACCCTCCCA CAGATCCAGC ATCACCCAGT GAATGTACAT 
TAGGGTGGTT TCCCXX^CCAC CTTCGGGCTT TGTTTGGGTT TaKTTGTGTT TO GCTCT TOO 
CTAAGCTGAT TTATGCAGCA GAAGCCCXaVC CGGCTGGAGA OAAACAAAAG CTCTTTTCTT 
TCTCCCGGAG CAGGCTGCGG AGCCCTTGCA GAGCCCTCTC TCCAGTCQCC GCCGQGCCCT 
TGGCCGTCGA AGGAGGTGCT TCTCXSaGGAG ACCGCGGOAC CCGCCGTGCC GAGCCXSGGAQ 
GGCGGTAGGG GCCCTGAGAT GCCGAGCGGT GCCCGGGCXC GCTTACCTGC ACCGCTTGCT 
COQAGCOGOG GGGTCOGCCT GCTAGGCCTG CGGAAAACGT CCTAGOIACA CTCGCCCOOa 
GGCCGCXUOG TCGCCCGGGA GGCCGAGCCC GCGTCCX3GAA GGCAGCCAGG CGGCX3GGCGC 
GGGGGGGGCT QTTTTGCATT ATOTGOGGCT OGGCCCTGGC TTTTTTTACC GCTGCATTTG 
TCTGOCTGCA AAACGACOGQ COAGGTCCCM CCTCBTTCCT CTQQGCAfiCC TGGQTCTTTT 
CACTT6TTCT TG6ACIGGGC CAAGGTQAAO ACAATAGATG TGCATCTTCA AATGCAGCAT 
CCTGTGCCAG GIGOCTTGOG CTGGGTCCAG AATGTGGATG GTQTGTTCAA <3AGQATTTCA 
TTTCAGGTGG ATCAAGAAGT CtAACGTTGTG ATATTGTTTC CAATTTAATA AGC3UVAGGCT 
GCTCAGTTGA TTCAATAGAA TACCCATCTQ TGCATGTTAT AATACCCACT GAAAAT6AAA 
TTAATACCCA GGTGACACCA GGAGAAGTGT CTATCCAGCT GOSTCCAGGA GCCGAAGCTA 
ATTTTATGCT GAAAGTTCAT CXTTCTGAAGA AATATCCTOT GSATCTTTAT TATCTTGTTG 
ATGTCTCAGC ATCAATGCAC AATAATATAG AAAAATTAAA TTCCQTTGGA AACGATTTAT 
CTAGAAAAAT GGCATTTTTC TCCOGTGACT TTCXiTCTTGQ ATTTOQCTCA TAOGTTGATA 
AAACRGTrrC ACCATACATT A6CATCCRCC CCGAAAGGAT TCATA ATCAA TGCASTGACT 
ACAATTTAQA CTGCAT6CCT CCCCATQGAT ACATCCATQT QCTCTCTTTQ ACAGAGAACA 
TCACTGAGTT TGAQAAAQCA GTrCATAOAC AGAAGATCIC TG6AAACATA OATACACCAG 
AAGGAGGTTT TOACGCCATG CTTCAGGCM CTGTCrGTQA AAOICATATC GGATOGCXSAA 
AAGAGGCTAA AAGATTGCTG CTOOTGATGA CAGATCAGAC GTCTCATCTC GCTCTTGATA 









WO 02/086443 

GCRAATTGGC AGBCATAGTO 
A03TCAAATC GACAACCATG 
ACAACRACAT TAATGTCATC 
TTCTACCCCT CTTGCCAC3GC 
ATAATTTCGT AGTOGAAQCC 
ACCAGGTACA AGGCATCTAT 
CAG6CATOGA AGGATGCAGA 
TTACAATGAA AAAATGTQAT 
GTTTTAATGA AACCQCTAAA 
ACAGAGQACC TAAAGGAAAO 
OTGATGAGAA TAAATGTCAT 
ACakAGGATCA GCCTGTTTGC 
ACAAAATTAA GCTTGGAAAA 
C31TATCACCA TGQAAATCTG 
GCTTCASTGG CTGGGAAG6T 
ZCAATTCAAA GGGCCAAGTG 



GTGCCCAATG 



TTTGCAGTTC 
ACCATTGCTG 
TATCAGAAGC 
rTTAACATTA 



ATTCATATAC 
TGTGTAGATG 
TTTGATGAAG 
AGTGGTOGAG 



ACGGAAACTG 
CACTAGGCCA 
AAGGAAAACA 
GTGAAATAGA 
TCATTTCAGA 
COGCCATCTG 
GCAATGATGA 
GAAAAAACTA 
ACAGAAACTQ 
AAACTTTTCT 
ATCAGTTTTC 



AATACTGTGA 
ATGGAQAOTG 
ASTGCCCTTC 



TCTTGArreG 

ATAAAATTAA 
TGCAAAGTGT 
TGGATATCAG 
TTAAACRCTT 



CAAAACCTCA 
CTCCAGCCCA 
GTTCCTTAAA 
GTCCTCATCA 



GAAQACTGAC 
AAAATGTGTC 
ACTTATTTAG 
TACCTGTTAT 



TATATTCTAA 
ATQAATAAAT 
AAASATTATT 



AAGTATCCTC 
TTACTACTGT 
ATCAGCATAG 
CCCTACGCTT 
GTACAGTAAT 
GGTTGCCAAA 



GCTTTTTAAA 
ggatactaat 
ATAAjGTTTAT 
AAAAACTAAT 



ATGCAATGCC 
TGTGCTCTCA 
AGCTACTTGA 
GTCCTGATCA 
6ATTACAQAG 
GCAGTCACCT 
GCTCAT6AAA 
TGQAATTGTT 
TTGCTCAOGG 
ATCATQATGT 
TTGAGACTAO 
AATGTAGATC 
CX:CAGAaAGA 
CCCTCGCACTG 
CACTTCAACA 
TCACTCTTTC 
GTGTGTAGTT 
TCCAGCA.TTC 
GTATOTCACA 
AATACAATGT 



GTOAACACTG 
TSCACCCKA 
TGGAACAACA 
GAATATTTTT 
TTAGACAGGT 
TGTCAGCCTC 



CTTTCAGGTG 
AATAATTGCT 
TCATGCCAGT 
iGACTCACATA 
TQTCX3TTGTA 
CTCTGAAGAG 
ACAATGCTGT 
GACATGTGAG 



TCATCTGAAA 
ACTTTCAGAG 
ATTTCATTGG 
ATCAAAGGCT 
AGTGAAAQTT 
TCCAGATGGG 
AGTTCTTTTC 
TGCAATAATC 
CAGCTGTCAG 
AGATTCCAAG 
TTCTGAGAGT 
TTGTGGGAAA 
AAAGGATGAC 
TGAAGCAGGC 
AGCAGCAGCC 
TGTSTGTGGA 
CCCCACCTGT 
CAATTTOTCT 
GCATTATGTC 
CATCATTTTC 
GATACTACAA 
AAAAAAGGAT 
GAAGCCPGAA 



AACAACGTCT 
AAATTAATAG 
TATAAGGATC 
GCAAACCTCA • 
CAG6TGGAAA 
TCCAGAAAGC 
AATOTAACAG 
AAACXH'ATTG 



TGTTTCCAGT 



TGTTCATGTC 
TTTTCTTGTC 
AOATGCCAAT 
CAGCACTQTG 
AGGTGTQAGT 
TATACAOCCT 
CAGGCTATAC 
GACCAAACTT 
ATAGTTACAT 
TGGAATAGTA 
AAGTTGATTC 
GAAATAAAAA 
AAAAAOATTT 
ATAA-rmAA 
ACACTCOAAC 



AAGAGGTQAA 
TTAIGCATer 
TCTCCTCTTT 
GATQACTGGA 
CACTTXATCA 



GCACTTTACT 
CACTGATTAC 
GAGAGAGTTT 
GAAAAAAATA 
GAATAGACAA 
CAQATACAAC 
GTQTTTATGG 
GCCTTTATQT 
TTAATTAAGT 



GTAATATATA 
ACTTTACAGG 
AGCATTGTGT 
ATCTGGCAAG 
GAACAGCTAG 
CTTAATCTTA 
TTTGCTTATT 
TTTGTTTTCr 
GCTAAGTrAC 



1620
1680
1740
1800
1860
1920
1980
2040
2100
2160
2220
2280
2340
2400
2460
2520
2580
2640
2700
2760
2820
2880
2940
3000
3060
3120
3180
3240
3300
3360
3420
3480
3540
3600
3660
3720
3780



LOPECGWCVQ 
GEVSIQLRPG 
SRDFRIjGFGS 
VHRQKISGMI 



TIAGEIBSKA 
HVTSNDEWI.P 
CVDBTFU>SK 



AAFVCLQNDR 
EOJFISGGSRS 
AEANFMLKVH 
YVDKTVSPYl 
DTPEGGFDAM 
NNVYVKSTTM 
ANLNNIiWEA 
NVTVTMKKCD 



RGPASFLHAA 
ERCDIVSNIiI 
PLKKYPVDLY 



FSCPYHHGML 
RCECTOFRSI 
DQTSECPSSP 
KLILQSVCTR 



LQAAVC3ESHI 
EHPSIiGQLSB 
YQKLISEVKV 
VTGGKNYAII 
FDEDQFSSES 
CAGHGECBAG 
GRFCEHCPTC 
SYIiRIFPIIF 
AVTYRREKPE 



WVFSLVLGLG 
SKGCSVDSIE 
YLVDVSASMH NNIEKLHSVG 
CSDYNLDCMP PHGY1HV1.SL 
LVMTDQTSm. 
FAVQGKQFHW 
FNITAICPDG 
IHIHSNCSCQ 
SGRGVCVCX3K 
DRCQCPSAAA 
YTACKBNWHC MQOaPHKLS 
VbllRQVIIiQ 



NDLSRra4AFF 180 

TENITEPKKA 240 

ALDSKLAGIV 300 

YKDI.I.PLI.PG 360

SRKPGMBGCSt 420 

CEDNRGPKGK 480 

CSCHKIKLGK 540 

QHCVNSKGQV 600 

QAILOQCKTS 660 



Seq ID NO: 646 DNA sequence
Nucleic Acid Accession #: NM_0033]
, .2574 



ATGQAATCC6 I 



AACCCAGAGG 
GATGCTCTTT 
GATAAATATG 
GCTATTCAAG 
AAATTTGCTT 



GAAATTGCXC 
AAGAATTTAT 
CATTTACAGA 
•rTATATGGAG 

caaactaaca 
agcxx:agatt 
acxtctagat 
tcctgtgaat 
tcagatgaaa 
gaatcaagtc 



ACTGGTTGAG 
TAAATAAATT 
GCCAAAATGA 
AGCCAGATGA 
TTGTTCATAT 
AACTTCTTC3V 
TGCGGAATTT 
CAGCATCTAC 
ATAGGAACAA 
AGAACATGCC 
AAACTAAACA 
OTGATGTGAA 
CAGAATGCCG 
TAAGAAATTT 
AGAGTTCTGA 
TTCTAGCTAA 
AGAAACAGIG 
CAAATCACTG 



ATCTTTTGCA 
AAAAGCTGTA 
AAACCTCCAA 
GQTATTAACT 
CAGTTGTGAT 
ACCACAAGAT 
GTCATGCCCA 
GACAGATGAT 
AGATTTGGTT 
AAAGTCTGTT 
ACTTATTATT 
ATTAGAAGAA 



31 
I 

^ TTGACAATTG 
I QACCTTACTG 
ACTGTTAACC 
AAACTAGAGA 
TACAGTCAAG 
AGAATTCAAG 
TACTTTCAAA 
CAATTT6AAC 
GAACGTGGAG 



41 
I 

ATT0CA1AAT 
ATGAACTAAO 
AAATTATtSAT 
AAAACAGTOT 
CAATTQAAGC 
TGAGATTTGC 
TGGCCAGAGC 
TGTCACAAGQ 

cacTAOcavcr 

TGCTTTCAGR 



TCCGCTAAGT 



TGAATTAAAA 
AAACTGCAAG 
TAATGTCAAA 
AGAAATQCTG 



GCAGAAATAG 
■rrrGGAAGAG 
TCAGTTGTAC 
GTGCCTGGAT 
CAAAATAGTC 
ACTGATTCAA 
ACXAAAQAQT 



CTTGTTTTAT GAAAASACAA 900 
CTAAACCAAG 
AirrCAAGGA 
TAACCCT6AA 
ATCAASAACC 



AAACATACXA cn 
ACATCTAAAT GGTTTGACCC A 



ACCTCrGGTa 1020 
QAATAAAAOG 1080 
1140 
1200 
1260
1320 
1380 



^ CCAOAATCCT 
A TACAGMGCAO 
: ACC AATA TCA 
i TACCTTGGAT 



GATTACATQA GCT6TTTTAS
TTGTCAAOVC CTTATGGCCIV 
ACTCCACTTC AAAATTTACA 
AAAGGAAGAA TTTATTCCAT 
CA.GGTGTTAA ATGAAAASAA 
GATAACCAAA CTCTTQATAG 
CACAGTGATA AGATCATCCO 
GTAATGGAQT GTGGAAATAT 
CCATGGGAAC aCAAQAGTTA 
CATGGCATTG TTCACAGT6A 
AASCTAATTQ ATTTTQOQAT 



PCT/US02/12476 



TTACCGQAAC 
ACTTTATGAT 
TGATCTTAAT 
CTGGAAAAAT 
TCTTAAACCA 



GGATGTATTT 
ATTTCTAAAT 
GAGAAAGATC 
TCCATTCCTG 
ATGGCCAAGG 
TCTCCTAACT 
A6TCATAATT 



ATGGGAAATC 
TGTACTATAT 
TACATGCCAT 
TTCAAGATGT 
AGCTCCTGGC 
GAACXaWTTGA 
CCATTTTGAA 
CTTCATCCrC 



TAATTATATG 
TAAGTCAAAG 
GACTTACGGG 
AATTGATCCT 
GTTAAAGTGT 
TCATCCCTAT 
AGAAATGAAA 
AGCTG CTAAA 
CAAOACTTTT 



BIAIiRNIiNU] 
LYGENMPPOD 
TSRSECROLV 

KHTTFEQPVF 
LSTPYGQPAC 
QVLNEKKQIY 
VMECGNIDLN 
ICLIDFGIANQ 



AEIGYRNSLR 

vpgskpsoid 
Tkeyqepevp 
svskqsppxs 

FQQOQHQIIA 
AIKYVNIjEEA 
SHLKKKKSID 



QtNKTKQSCP 
SCELRNXiKSV 
ESNQKQWQSK 
TSKWFDPKSI 
TPLQNLQVIA 
DNQTLDSVRN 
PWERKSYWKN 

DsovorvtrcM 

ISKUIAIIDP 



SIPBUABPY V 



Seq ID NO: 648 DNA sequence
Nucleic Acid Accession #: NM_015507
Coding sequence: 241..1902



accrcGGCCA ggctagccag 

GCGGTGCCTQ GCCTCCCCTC 
GGAGGACCCa AOCX3GCTGAG 



i i 
GGOGCCCCCA OCCCCTCCCC 
CCAGACTGCA GGGACAGCAC 



ATGCCTCTOC CCTGGAGOCT 
CCAGTGCAAG 
GAACTAAACT 
CATGCX3AACC 
CAGGATACAC 
CATGCCAACA 
ACATGCTCAT 
AGTACAGCTQ 



T6TGAAGCTA 
AGATGCTTTC 
AAACCCCGGC 
CTCAGTGGCC 
ATAAACIGTC 



AAA1GTCACA 
AATQAATGTA 
GGCTCCTTCA 
ATCCCTQAAA 
AAGAAGTTGC 
CCAGAACCCA 



TTGGTTTOGA 
CTATGGATAG 
AGTGTAAATG 
ATTCTGTGAA 
TTGCTCACAA 
CCAGGACTCC 



TGCGCrCCCG 
GCATCACGGG 
GGCCTGCTGC 
TGGATGTAAG 
CGGGAAAACC 
CAGATGTGTG 
GCCAGATGCT 
TQAAQACACA 
AAATGQAAOA 
CAATOGAASA 
ACTGCAATAT 
CX»TACGTGC 
CAAGCAGGGA 
GGAAGTCCTC 
AAACAGCATG 



CTGCTGCTCI 
TTGTTAGCAT 
TAOSGCTGGA 
TTTGGTGAGT 
TGCAGTCAAG 
AATACACACQ 



GAAGAAGGGC 



CGGCACGTCA 

GCGTGGGACC 
ATGTQAATGA 
GAAGCTACAA 
ACTCTAGGAC 
CACAGTGCCT 
ATATrGATCA 



AACTTCTGTT 
TTTGTGAAAA 
TGGAAGACAG 
GAAGCAGAAC 
TCRGGCTTAT 
TTGACTTTGT 
TTAGAATTAC 
TCTTGTATAA 
TTTCTGAATC 
CAGTATATCT 



TCAATCATGG 
CTGATCGAGA 
AAGACATTGG 
TGCTCTTTGA 
ACAGTAACAA 
GGAAAATTCA 
GTGGCAAGGG 
GTCCAGATAG 
ATGTCAGTTC 
TAGCTQAAAA 
GATATGCCAA 
TTTCCACATT 
GATTTGTATA 



\ AAGATAGACT 



AACTTCCAAA 
GATCTGTGAC 
TAATGCTATT 
CCGATTGAAA 
TTACCGGCTG 
TGCCCrOGCA 
GTTGTATCAA 
CAAAACCGGC 

CCTGGTTTTT 
ATTGTAATGT 
TATTTGCTTT 
ATATTATAAA 
AGTAAGTTGA 
AATGTTTAAC 
TrroCCTAAG 



AGCCACCATG 
TATAAAGGCA 
AOAGCRCCTG 
AAAAAGAAGG 
GTGAACTTGC 
GGTAAAAAAG 
AAAGCCXTTGA 
3 OIGAATQAAQ 



QATATGACTG 
CCAATTGCTT 
ATGGACTTCG 
GTACCATCAA 
CAftAAATTAA 
AGCCCTTCAA 



TG6AAA(»GG 
GGCTTCTATA 
CTTCTCCTAC 
GCCXX»GACA 



AGAATGACAT 
CAGGTOAATT 
AASATTTAAA 



GGAACTQATG 
GAAATC6CAG 
GTGGATGACT 
TTGATATTGC 
ACCAACAGAA 
AAATATCATA 
ATATGGAAKI 
TG AGCT TCTC 
TGTTTQACTC 
TCGCTTAGCT 



TGGCAGTTCC 
CTGACCTGCA 
AAGTCGGGAA 
CCACQAGTGA 
CTACCAAAAG 
TGGATGGCGT 
GAATGTTACT 
ATCATAGGAC 
ATATTATTGT 
TCACTGTATC 



TATASATATA 
CAATACCCAA 
GTGTTCTGCT 
AQACAGAATC 
AAATGTTACC 
CTATGAAQAG 
GAAAATGAAA 
AGAGGAGCGA 
OOGCCTGATT 
TATCTCGQTT 
TGATTTTGAC 
GGCCTTGGCA 
ACCCX»AAGC 
ACTTCGAGTG 
GGATGAAAAG 
CATCATTTTT 
CTTGCTTGTT 



CTCTGGCATT 
AAGAT6CCTT 
TTCTCAGTCA 



GTAAAGAATG 


ACTTTCCACC 


7GCTTGTCAG 


1440 


TTCCAGCAGC 


AACAGCATCA 


AATACTTGCC 






CAAATGAATS 




1560


ATAGGAAGTG 


GAGGTTCAAG 


CAAGGTATTT 


1620 


GCTATAAAAT 


ATGTGAACTT 


AGAAGAAGCA 


1680 


GAAATAGCTT 


ATTTQAATAA 


ACTACAACAA 


1740 


TATGAAATCA 


CGGACCAGTA 


CATCTACATG 


1800 


AGTTGGCTTA 


AAAAGAAAAA 


ATCCATTGAT 


1860 


ATGTTAGAGG 


CAGTTCACAC 


AATCCATCAA 






TGATAGTT6A 


TGGAATGCTA 


1980 


ATGCAACCAO 


ATACAACAAG 


T6TTGTTAAA 


2040 


CCACCAGAA6 


CAATCAAA6A 


TATGTCTTCX: 




ATAAGCCXCA 


AAAGTQATGT 


TTGGTCCTTA 


2160 


AAAACACCAT 


TTCAGCAQAT 


AATTAATCAG 


2220 


AATCATGAAA 


TTGAATTTCC 


CGATATTCCA 


2280 


TGTTTAAAAA 


GGGACCCAAA 


ACAGAGGATA 


2340 


GTTCAAATTC 


AAACTCATCC 


AGTTAACCAA 




TATGTTCTGG 


GCCRACTTGT 


TGGTCTGAAT 


2460 


ACTTTATATG 


AACACTATAG 


TGGTGGTGAA 


2520 




GGGGAAAAAA 


ATQA 




31 


41 


SI 




1 

DLTDELSLNK 


i 

ISADTTDNSG 


1 

TVNQIMMMAN 


60 


YSQAIEALPP 


DKYGQNESFA 


RIQVRFAEiiK 


120 


QFELSQGNVK 


KSKQLLQKAV 


ERGAVFIiEML 




AQBSPSGSLG 


HLQHKMNSCD 


SHGQTTKARP 


240 


PGKVPVNLLN 


SPDCDVKTDD 


SWVPCFMKRQ 




QHSHFKEPLV 


8DSKSSBLII 


TDSITIiKNKT 


360 


RKSECINQNP 


AASSmiWQIP 


ELARKVNTEQ 


420 


CKTPSSNTUJ 


DYMSCFRTPV 


VKNDFPPACQ 


480 


SSSANECISV 


KGRIYSILKQ 


IGSGGSSKVF 


540 


EIAYLNXLQQ 


HSDKlIHIiYD 


YEITDQYIYM 


600 


KLEAVHTIHQ 


HOIVHSDUCP 


ANFI.IVDGML 


660 


PFEAIXDMSS 


8RBIGKSKSK 


ISPKSDVWSIi 


720 


NHEIEFPDIP 


EKDLQDVLKC 


OiKSDVKQKt 


780 


yVLGQLVQLN 


SPKSIUCAAK 


TLYEHYSGGE 


840 



AGGCXGC6A0 
CCGGTAACTG 
AGCTGCTACG 



GGGTCOOGCC GGCGCCCTCC CGAGGaaQGC TCAaaAGGAG GAAGGAGGAC COGTGCSAOA 



AGGTGGTTTC 
GCCTGGGGTC 
CAAGGGAGTC 
AAACAAATGC 
GTGTGGAATG 
GTGCTTTTGC 
ATGTGCCATG 
GTGTCCATCC 
AT6TGCCTCT 



TCTACAACAX TTCTAQAAAA 
TTATGATACT TCTTGGAAAC 
GGGTCTTTCA TAGCCAAACT 



2040 
2100 
2160



Seq ID NO: 649 Protein



11 



21 



31 



51 



I I I 1 I 

MPLPWSLAI.P LUiSHVAGGF GHAASARHH6 LLASABQPGV CHYGTKLACC YGHRRHSK0V 
CEATCEPGCK PGECVGPNKC RCPPGYTCKT CS^WNECGM KPRPCQHRCV NTHGSYKCFC 
LSGHMLMPDA TCVWSRTCAM IKCQtSCBDT EEGPQOiCPS SGLRIAPHGR DCLDIDECAS 
GKVICPYNRR CVNTPGSYVC KCHIGFELQY ISGSYDdDI MBCTMDSHTC SHHANCFNTQ 
GSFKCKCKQG VKGNGUICSA IPEHSVKBVI. RAPGTIKDRI FOOiUVHICMSM KKKAKIKNVT 
PEPTRTPTPK VNLQPFVYEE ]:VSRaG^rSKG GKKGNEEKMX EGLEDEXREG KAUCNDIEER 
SLRGDVFFPK VNSAOEFGIiI LVQRXAI.TSK tiEHKDZiHISV DCSFNKGICD WKQDREODFD 
NNPADRONAI GFYMAVPAUV GHKKDZGRUC UiU>DLQFQS HFdjLFDYRb AGDKVGKLRV 
FVKNSNMIUA HEKTTSBDBK WRIGKIQIiYQ 6TDATKSIIF EAERGKGKrS EIAVDGVMiV 



COGTCICAGQ TCCCTGGOGO 



Seq ID NO: 650 DNA sequence
Nucleic Acid Accession #: NM_
Coding sequence: 259..2379



GCAGCTCCAG 
TTAGACGGGG 
CTCATTTTCA 
ATCTTTGGAT 
ATCAGGARTT 
CTCCTARGAG 
ATCGCCTAOV 
GCGGTGGAAA 



GGAAAGCCTG 
GGGGRTCTTC 
TGAAGAAAAT 
GGCACAGTCT 
ACATGACGTT 
TGGAGCATTT 



GTAACTTTTG 
AGAGAC3VTTG 
TTTCTGGGAA 
CTAGAGTTTG 
TTCACATTCC 
ATATATTACT 



CTGAGOAaCT 
ATCCACACAC 
GATTTTGGTG 
TTGACCAGTG 
CAAAAAGTTT 
TTACTTTTTT 
CTGTCTQTTA 
CAGCCTGCAA 
AAAATAASGC 



21 
1 

AACCCCGGAQ 
ACAGOGGCCT 
AAAATGAGTA 
TGAGGATGCA 
GGAGATGTTT 
CTTCACCTGT 
TTTCCCTAAT 
TCTTCCXCTC 
TSTACCAACC 
AGTATATTCT 
TGAATGTGAC 
AGAATTTCTT 
TCXaiAGGCAT 
TGCGCCTCCA 
TATTGGAACA 
AATTGATGTT 
CAGCATTGTA 
TAAGGCAGAT 



TOGACCGCCC 
AAATAGTGAA 
AAGAGTGATT 
ACATTTTTGT 
GAACCAATTA 
CTGATGGGTC 
GCAAATCIGG 



CCCGAGTAAT TGACCCaGGA 120 



TGGAGTTGTG 
CCAGGTTTCC 
GGAGTTTGCT 
CTGTGCCTTT 
CATGTTCGAC 
ATTCGAATTG 
TACGTCTATG 
CGTCAGTACC 
TTATTTATGA 
GGAAGCAAAA 
CCAATCAGTG 
TCTAAAGTTA 
TCCAAATCCA 
ATTACTAGCC 
ACATCAATGA 
TGTGGTGAAC 
GGSAAOGGCC 
AAGAGTGATA 
TCAGAACCAA 



GTGTGTTTGT 
AAGTCATACA 
GAGTCTTCAG 



ATATCCCATG 
TAAARTACCT 
AGACATGCAC 
AAAGTCGAAG 
AACACAAAAA 
TGGGAACCAQ 
ATGATTACXrr 
GAOAGGTOAA 
CTGCCTCQCC 



GATTCTTACC 
GCAAAAAGCA 
GCTTCTTGCT 
TTATGACCTG 

ACATQATGGC 
CX3GCTTGTAT 
CAGGATTACC 
TCCTTATCAG 
GATGACATTA 
AOAATGGGCT 
AGTACTACAQ 



AGATTACRAT 
GGTCCTCAGA 
CTTAAGACTT 
TGCCCCAACA 
GTTTCAATAT 
AGAAGATTCA 
TCTCTTATGT 
GAGAAGCTAG 
TTGTTCATGC 
ATTACTTGGT 
GTGTGGTTTC 
CTGAACAAAG 
QATGCTTCTC 
CTTCTTTTAG 
CGGAACCAAG 
CTTGTGCCAT 
TGGGAGATAA 
QCAAAAGCAA 
ATTGTTGGCA 



ATGAGGAATT 
CATCXaAGCC 
TGACGTGTAT 
CTGTTCCCAG 
ATTATGACCA 
AATGTTCACC 
AAATTCATGT 
AATTAATTGA 
ACTGTGATGA 
AQAAAACAGA 
CTGGGGGACA 
TGTATTTTAA 



TGAACATTTT 180 
ATGTGGTAAA 
TTTTCTACCC 
ATGTATGAAA 
GAGTATTGCC 
AAACATTGAA 
GGTTCCACCT 
CACTTTrGGG 
GACTGTTCCT 
ACAAGTCCAA 
AGGATATAAG 
AAGTGATGAG 



ACTTCRTTGG 
AACTTGGTGA 
TTTTGTATTT 
TCTTAGCTGC 
AT6CTGTTGC 
TTGAAGGAGA 
QCTACTTTGT 
CTGGCATTAT 
AAAAACTAAA 
TAGTGACACT 
CTTGGGTCTC 
AAGCTCX5ACC 
TCTCTGCTGT 
AACX»AATCG 
QAATCATOXQ A8TTTTTCTT 



GAGACCAATT 
ATTTTTGCTG 
CACTGTTGTC 
TTTCACAATG 



aV OAAB CAAA 
TAOGTTCTTC 
TCAAGAATAA 
AAATGTGCAG 
CCTTTTCTAT 
TTTACCTTTT 
GTATCTTTTT 
ATTCAAGTAT 
ATTTCTAAGA 
QTCTTATAAT 



TTTSrOTTAC 
TTTTGCACTT 
TATGACTCAT 
GTTAATAATA 
TTATGAAGAT 
TGATATAAAA 
ATACATATTT 
TTTTATCATG 
AAATTGTAAA 



AOCAGCATCC 
TGTATCTGAA 
TGGCCTGGCA 
AGQTTCCACA 
TTOTCATTCA 



ATGGCACTTC 
AAATCCAAAC 
CCAGGTTAAG 
TCTCTGGGGA 
GTaAAGGAAO 



TACATTTTGT 



ATASTCTOST 
CAAAQAGTGT 

taaaatgtcc 
aattgacttc 

AOTTOCTTAT 
ATTATACAGT 



:aat 

TCTACTCTTQ 
TCAAGATATT 
GAAAATAAGC 
CTATTGTGAX 
ATAaXCTTCT 
CTTTAAAAAC 



ACAGCAAATC 
ACTTTGACA6 

GcrcAsscaccc 

ATCTCCAGAC 
AGT6CX3CGGA 
CAGAGCAACA 
TCTCTGCTTG TTCACCCAGT 
GATACTTQAA GAACATTTTC 
ACCTA1GCAC TGTTTTOTAA 
TATACTGOAA 
AACAATATAC 
GACAGAGTTA 
GTAAGAGTAT TTTAAGATGT 
TCTTTGCTGA AGTATTTAAA 
TTATATGTAT TTGAACTTTT 
ATTTTAGCAC TTTGGTAGCT 

TirrATAcrGT 



ATGGGGAACA 
CAACATTAGT 
ACTCTTGCCA 
TTCCTTAAAT 
GAAATTTATG 
TCTCGGATGT 
TGATCATTGT 
AGAATTGGCT 
CTTCTGGGTT 
CAAGAGAGAT 
AAAQCACAAT 
6AAGGTCATT 
TGCASTAGCA 
CTCACCAGAA 

ACAGGTCGAC 
GATTAGTCCA 
CCCCAGTTCT 
TTCAGGAGTG 



CCACTATIGA 
TAAA GGGTT A 
CCTTTTTTAA 
ATTTTTTOTT 
ACCTTTCICA 



TCACA6ATCT 
TTGTATTATG 
GTAGACAAAA 



CTGCTCACTG ATCCTTCTGC 



1320 
1380 
1440 

1500 



1740 
1800 
1860 
1920
1980 
2040 
2100 



GAATCACTGT 2460 



2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 



CFGAAAACAG 
GAGGAATCTT 
ACTATQCTAT 
rCTTATCCTT 
TTGAAATCCT 
TTTACACTGA 
ATACCAAAAA 
ATCTAAAATG 






LFTCEPITVP 
FVPTCIEQIK 
TEFLGPQKKT 




MiMFTFLLTC IFIiPLLRGHS 
FLPLANLECS PNIETFLCKA 
LECDRI.QYCE ETVPVTFDPH 
CAPPCPNMYF KSDELEFAKS 
YSIVSLMYFI GFLU3DSTAC 
VILTITWFLA AGRKWSCEM 
LYDLDASRyP VLLPI/CLCVF 
SGLYLVPLVT LLGCYVYEQV 
LMTLIVGISA VFWVGSKKTC 
KKHYKPSSHK I,KVISKSMGT 
KADGASTPRL REQOCGEPAS 
TGIAQSNMIiQ VPSSSBPSSL 



Seq ID NO: 652 DNA sequence
Nucleic Acid Accession #: N
Coding sequence: 171..2126

21 



RCKKMAYNMT 
WPPCRKLCE 
EQVQRDIGPW 
CATLFTFLTP 
DTWLGSQMK 
AWGTPGFLTV 
ISUmVRQVI 
SOHCRQYKIP 



FFPNLMC5HYD 
KVYSDCKKLI 
CPRHIiKTSGG 
LIDVRRFRYP 
ACTVUMLLY 
MLLALNKVEG 



CPYQAKAKAR 



QSIAAVEMBH 
DTFGIRWPEE 
QGYKFUSIDQ 
EHPIIYYSVC 
PFTMAGTVWW 
DNISGVCFVG 
KKFMIRIGVF 
PELAbPMIiar 






EQKAVWPHAV 
VOI.SLLIA6I 
NRITWErTNV 
TEWAGPFKRN 

STGATANHGT SAVAITSHDY LGQETLTGIQ TSPBTSMRBV 600 
PAASISRLSG EQVDGKGQAG SVSESARSEG RISPKSOITD 660 
KGSTSLIiVHP VSGVRKEQGG GOiSDI 



ATGATGAACT 
AGGTCAAACT 
AAAACACACT 



TGGTTCTTGA 
TGTCAGAAGA 
ACAGCCAGGG 
ATAAATTAAA 
ATCTACAQAC 
CATATCTTGQ 
GT6QATTTCT 
GAAAATATGA 



11 

I 

3 GAAGCGGCCA 
TCAGGACAGC 
AGGTTCTTTT 
TCTCAAATAT 
TGCCTGCCAT 
AGGGAGTGAT 
GCATATATGT 



I 

CAACCOGGCG 
AGGCCCCTGT 
TCTAATTCCA 
TATGAATTAC 
ATCCTTACTG 



TOCAAGATTA 
ATGATTGOOT 
TAATTTCACT 
AGGCTCX3GGG 
CTACCCCATT 



GGAGACCCGG 
CTATGCrCAC 
GCTGATTGAC 
ATGCTGTQGQ 
ATCAOAGGCA 
ACCATTTGAT 
TGITCCCAAO 
CCCAAAOAAA 
CAACTATCCT 



CAACTCTACC 
GGAG6A0AGC 
GTTGTCTTCC 
AGGGACCTCA 
TTTGGTCTCT 
AGTCTGGCTT 
GATQTTTGOA 



QTGOCAOrAT 
AAAACCAGTT 
CACAGACATC 



TGGCTCTCTC 
CGGATTTCTA 
eTTQAQTGGC 
TCTGTACATC 
QATCACCTCA 
CGTTTAAGGC 



QAaAGATGGT 
TCAAAACiGGA 
ATGTGCTAGA 
TGTTTOACTA 
GTCAGATAGT 
AGCCAGAAAA 
GTGCAAAACC 
ATGCAGCACC 
QCATGGGCAT 
TAATGGCTTT 



CGCOGTACCA 60 

AQCCGTGCCC 120 

ATGAAAGATT 180

GQCTTTGCAA 240 

AGCTATAAAA ATCATQGATA 300 



TATAATTTCC 
ATCTGCTGTT 
TTTGCTGTTT 
CAAGGGTAAC 
TGAGTTAATA 
ACTGTTATAT 
ATACAAGAAG 



TGAAAAATCT 
AAAGCAAGAA 
ACAGAAACAA 



ATAAAAATTA TOTGGGGGGA TTAATAGACT 
QTGCTGCTAC 
AATCTAAATC 

AAAATGTATA TACTCCTAAS 
AGTTAATAAG 

CTCAAAAGCT AGAAACCAGT 
AGGAACAGAC AAGTTAATGA 
AACCAAGCAC 
GAAAGGGGGT 



TTTCTTCTTT 
ATTGGAGTCT 
ATQATTGGXG 



ACACTACACC 



CXACAACTAG 



TGGGAGCCTT 
GGGTTCTGCC 
ATTAOTGAAX 



TTGQGAAAGT GACAATGCAA 
TGGGTATCAG GAGGCAGCGG 
ACATCCTATC TAGCTGCAAG 
GGTGTGATAC AGCCTACATA 
CTACCAACTT GTTTCTAAAG 
GATATTATTT TGTGTATGAA 
TAATCATGTG GTTTTGTATA 
AATGTAAGCT CTTAACTATG 
TTCTQAATAT 



CCAGATCAAC 
CAAAAGGGTT 
TTTGAATTA6 
CTTAAGGGCG 
GTATAATTGA 
AAGACTCTTA 
AGCTATCTTA 
TCTAAATCAA 
TTAATAATTG 



AGAATGAAGA 
AGAGAGAAAr 
GCCTGAAAGA 
CAGGTGTCAT 
ATATGGAGGA 
TGQATAAOGT 
CCAGAAGACT 
TGTTGAATGA 
ATACACTGAA 
AAGTGTGCCA 
ATGCCTGGGT 



ATT6AACCAT 

TCCrrTTATT 
CAGGCAAACA 
TCTTCTGCTT 
CTCCTGTGGA 
GGAAGATGTG 
TGAAGATQA7 
GACAGAATCA 
AAATAAATTA 
GTACTTTAXG 



TGATCGCTTT 
AGACCAATAT 
GCCCATCTGT 
TTGACTTTCT 
ATGTGTAATT 



AACTCCAATT 
TAGCCXTTGAG 
GACTCCAAAA 
TATCACTGTG 
AAAGCTTCAC 
AATAATGTCT 
GTGTCAAACA 
GCTTCAAAAA 
TTACAAAAGA 
CATCCTGCCG 
GATTTTAAAG 



GCTTATGTGC 540 
QATGAATATC 600 
AAQQATTACC 660 
CAAGGCAAAT 720 
GTTCTTATGT 780 
ATTATGAGAa 840 
CAACAAATOC 900 
CCCTOGATCA 960 
CACCTCGATG 1020 
ATGGAGGATT 1080 
CTAGCCaUMJA 1140 
CAAGCCAGTG 1200 
ACCGCAAGTG 1260 
TTATCAACAG 1320 
1380 
1440 
1500 
1560

1680
1740 
1800 
1860 



CATTATGTTA 

TAGATTcawrr 

TCTTTCTGAA 



ATTCTTCCAA 
C3VGTCAGATT 
CCCQATGTGG 
TTAGTGGAAG 
GATGAGTGTG 
TTCATTGOAA 
TTAAACAAAA 



2220 
2280 
2340 
2400 
2460 






I 

MKDYDEUiKV 
LKHLRROHIC 
AYVHSQGYAH 
QGKSYUSSEA 
QQMLQVDPKK 
MEDLISLWQY 



DVWSMGIIJ.Y 

DHLTATYLLL 
LXDYDWCEDD 



21 
1 

) GFAKVKLACH 
I KIFMVLEYCP 
OBYHKIiKIjIO 
VLMCGFtPFD 
PWIMQDYNYP 



I.STGAATPRT 
FFBPKTFVNX 
RRCRSVEIiDIt 



DDNVMALYKK 
VBWQSKMPFI 
RLSIiSSFSCS 



KDYHLQTCCO 
IMRGKYDVPK 
HLDODCVTEL 
QASATPPTDI 



LPRIKTEIEA 
WPRQIVSAV 
SLAYAAPELI 
WLSPSSILLL 



SQPTKYWTES NSVESKSLTP ALCRTPANIO. 420 
PNRYTTPSKA SNQCLKETPI 480 



KIFVNSTGTD KLMT6VISPE 
LTRSKRXGSA ROGPRRLKLH 
QSDFGKVTMQ PBLEVCQLQK 



Seq ID NO: 654 DNA sequence
Nucleic Acid Accession #: NM_000582
Coding sequence: 88..990



NQKKREII.TT 
NQAHMEBTPK 
PDQLUNEIKS 
PDWGIRRQR LKGDAHVYKR LVEDILSSCK V 




GGCATCACCT GTGCCATACC AGTTAAACAG GCTGATTCTG GAAGTTCTGA GGAAAAGCAG 180 
CTTTACAACA AATACCCAGA TGCTGTGGCX: ACATGGCTAA ACCCTGACCC ATCTCAGAAG 240 
CAGAATCrCC TAGCECCACA GACCCTTCCA AGTAAGTCCA ACGAAAGCCA TGACCACATG 300 
GATGATATGG ATGATGAAGA TGATGATCAC CATGTGGACA GCCAGGACTC CATTGACTCG 360
AACGACTCTG ATGATGTAGA TGACACTGAT GATTCTCACC AGTCTGATGA GTCTCACCAT 420
TCTGATGAAT CTGATOAACT GGTCACTGAT TTTCCCAOGG ACCTGCCAGC AACC6AAGTT 480
TTCACTCCAG TTOTCCCCAC AOTAGACAOV TAiaATOGCC OAGOTGATAa TGTGGTTTAT 540 
GGACTGAGGT CAAAATCTAA QAAGTTTCGC AGACCTGACA TCCAOTACCC TGATGCTACA 600 
GACX5AGGACA TCACCTCACA CATGGAAAGC GAGGAGTTGA ATGGTGCATA CAAGGCCRTC 660 
CCCGTTGCCC ASGACCTGAA CGCGCCTTCT GATTGGGACA GCCGTGGGAA GCACAGTTAT 720 
GAAACGAGTC AGCTGQATGA CXrAGAGTGCT GAAACCCAC3V GCCACAAGCA GTCX3VGATTA 780 
TATAAGCGGA AAGCCAATGA TGAGAGCAAT GAGCATTCCG ATGTGATTGA TAGTCAGGAA 840 
CTTTCCAAAG TCAGCCGTGA ATTCXaiCAGC CATGAATTTC ACAGCCATGA AGATATGCTG 900 
GTTGTAGACC CCAAAAGTAA GGAAGAAGAT AAACACCTGA AATTTCGTAT TTCTCATGAA 960 

TTAGATAGTG CATCTTCTGA OGTCAATTAA AAGGAGAAAA AATACAATTT CTCACITTGC 1020 

ATTTAGTCAA AA6AAAAAAT GCTTTATAiSC AAAATCAAAO AGAACATGAA ATGCTTCTrC 1080 

CTCAGTTTAT TGGTTaAATO TCTATCTATT TGAGTGTGGA AATAACTAAT QTQTTtOATA 1140 

ATTAGTTTAG TTTQTGGCTT CATGGAAACT CCCTQTAAAC TAAAAGCTTC AGGGTTATGT 1200 

CTATGTTCAT TCTATAQAAG AAATGCAAAC TATCACTOTA TTTTAATATT TGTTATTCTC 1260 

TCATGAATAG AAATTTATGT AGAAGCAAAC AAAATACTTT TACCXa.CTTA AAAAGAGAAT 1320 

ATAACATTTT ATGTCACTAT AATCTTTTGT TTTTTAAGTT AGT6TATATT TTGTTGTGAT 1380 

TATCTTTTTG TGGTGTGAAT AAATCTTTTA TCTTGAATQT AATAAGAATT TGGTGGTCTC 1440 

AATTGCTTAT TTGTTTTCCC AOGGTTGTCC AGCAAITAAT AAAACATAAC CTTTTTTACT 1500 
6CCTAAAAAA AAAAAAAAAA AAAA 



1 11 21 31 41 51 

I I I I I I 

MRIAVTCFCL liGITCAIPVK QADSGSSEEK QIiYNKYPDAV ATWIiHPDPSO KQNLIAPQTL 
PSKSNGSHDH MDDMDDEDDD DHVDSQDSID SNDSODVDDT ODSHQSOESK HSDESDELVT 
DPPTDLPATB VFTPWPTVD TYDGKaoSW YGLESJCSKKP RRFOIQYPDA OBEDITSHMB 
SEEUIOAYKA IPVAQDLNAP SDVfDSRGKDS YETSQLDDQS AEIHSRXQSR LYKIUOUIDES 
NEHSDVIDSQ EIiSKVSREEH SHEFRSHBDM LWDPXSKEE DKHUKFRISH GIiOSASSBVK 

Seq ID NO: 656 DNA sequence
Nucleic Acid Accession #: NM_003108.1
Coding sequence: 76..1401



I 11 21 31 41 51 

I I I I I I 

GGGGTCGGAG OGaQAGaGGa ACCTCCGCAC GAGACCCAiSC GGCCCGGGTT GGAGCGTCCA 60 

GCCCTOCAAC GQATCATGGT GCAGCAGGCG GAOAGCTTGa AAGOGGAGAG ta^CCTGCCC 120 

CGGGAGGCGC TGGACAOSGA GGAGGGOSAA TTCATGGCTT GCAGCCCGGT QGCCCTGGAC 180 

GAGAQCGACC CAGACTGGTG CAAGACGGOG TCGGGCCACA TCAAGCGGCC GATGAACGCG 240 

TTCATGGTAT GGTCCAAGAT CGAACGCAGG AAGATCATGG AGCAGTCTCC GGACATGCAC 300 

AACGCCGAGA TCTCCAAGAG GCTGGGCAAG CGCTGGAAAA TGCTGAAGGA CAGC6AGAAO 360 

ATCCCGTTCA TCCGGGAGGC GGAGCGGCTG CGGCTCAAGC ACATGGCOGA CTACCCCGAC 420 

TACAAGTACC GGCCCCGGAA JUVAGCCCAAA ATGGACCCCT CGGCCXAGCC CAGCGCCAGC 480 

CAGAGCCCAG AGAAGAGCGC GGCCGGCGGC GGCGGCGGGA GCGCGGGCGG AGGCGCGGGC 540 

GGTGCCAAGA CCTCCAAGGG CTCCAGCSUVG AAATGCGGCA AGCTCAAGGC CCCCGCGGCC 600 

QCGGGCGCCA AGGCGGGCGC GGGCAAGGCQ GCCCAGTCCX3 GGGACTACGG GGGCGCX3GGC 660 

GACGACTAOG TGCTGGGCAG CCTGCGCGTG AGCGGCTCGG GCGGCGGOOG GGCGGGCAAG 720 

AOGGTCAAGT GOQTGTTTCT GGATGAGGAC GACJOAOJAOa AOSAOaAOGA CXMCOAGCTG 780

CAGCTGCAGA TCAAACAGGA GCCGGAOGAG GAGGAC6AQG AACCACCQCA CCatOCAQCTC 840 

CTGCAGCCGC CGGGGCAGCA GCCGTCGCAG CTGCTGAGAC GCTACAACGT OGCCAAAGTG 900 

CCCGCCAGCC CTACGCTGAG CAGCTCGGCG GAGTCCCCCX5 AGGGAGCQAG CCTCTACGAC 960 

GAGGTGCGGG CCGGOGCGAC CTCGGGCGCC GGGGGCGGCA GCCGCCTCTA CTACAGCTTC 1020 

AAGAACATCA CCAAGCAGCA CCOGCCX5CCG CTCGCGCAGC CCGCGCTGTC GCCCGCX3TCC 1080 

TCGCGCTCGG TGTCCACCTC CTCGTCCAGC AGCAGCGGCA GCAGCAGCGG CAGCAGCXX3C 1140 

GAGGACX3CCG AOGACCTGAT GTTC3GACCTG AGCTTGAATT TCTCTCAAAG CGCGCACAGC 1200 

GCCAGCGAGC AGCAGCTGGG GGGCX5GCGCG GCGGCCGGGA ACCTGTCCCT GTCGCTGGTG 1260 

QATAAGGATT TGGATTCGTT CAGCGAGGGC AGCCtGGGCT CCCACTTCGA GTTCXXXX3AC 1320 

TACTGCACGC CGGAGCTGAG CSAQATGATC GOGGQGGACT GGCTOQAGGC OAACTTCTCC 1380 

GACCTGGTGT TOVCATATTQ AAAGGCGCXTC GCTGCTOGCT CTTTCTCrCQ GAOGGIGCAQ 1440 

AGCTGGGTTC CTTGGGAGGA AGTTaTAGTG OTGATQATOA TGATGATGAT AATQATGATG 1500

ATGATGGTGG TGTTGATGGT GGCGGTGGTA GGGTGGAGGG GAGAGAAGAA GATGCTOATG 1560

ATATTGATAA GATGTaJTGA CX5CAAAGAAA TTGGAAAACA TGATGAAAAT TTTGGTGGAG 1620 

TTAAAGTGAA ATGAGTAGTT TTTAAACATT TTTCCTGTCC TTTTTTTGTC CCCCCTCXCT 1680 

TCCTTTATOQ TGTCTCAAGG TAGTTGCATA CCTAflTCTGG AGTTGTGATT ATTTTCCCAA 1740 

" AAAATGTQTT TTTGTAATTA CTATTTCTTT TTCCTGAAAT TOGTGATTGC AACAAAGGCA 1800 

6AGG6GGCGG CGCGGCQGAQ GGGAGGTAGG ACCOQCTCOO GAAGGCGCTG TTTGAAGCTT 1860 

GTCGGTCTTT GAAGTCTGGA AGACGTCTGC AGAGGACCCT TTTGGCAGCA CAACTGTTAC 1920 

TCTAGGQAGT TGGTGOAQAT ATTOTTTTTT CTTAAGAGAA CTTAAAQAAC TGGIGATTTT 1980



KIERRKIMEQ SPDNHHAEIS KRIiOKRHKMI. XDSBKIPFIR EAERLRUOIM ADYPUYXXItP 
RKKPXHDFSA KPSASQSPBK SAAG6QGGSA GBGAGGAKTS K6SSKKC6KL KAPAAAQAKA 
GAGKAAQSGD YGGAGDDWI. GSUIVSGSGG GQAiGXTVKCV FU3EDDDDDD DDDELQLQIK
QBPDEBDEEP PHQQI.U3PFa QQPSQLIARY NVAICVPASPT
ATSGAjQGGSR LVYSFKNITK 0IPPPLAQPA LSPASSRSVS 
IMFDI.SUIFS QSAHSASEQQ LGaOAAAGNti SI.SLVDKDU> 
LSEMIAfiDWI. EANFSDLVFT V 

Seq ID NO: 658 DNA sequence
Nucleic Acid Accession #: NM_001719
Coding sequence: 123..1418



ASLYDEVSAG 300 






GGGOGCAGCX3 GGQCCCGTCT GCA6CAAGTG 
CTGCXaCCTQ 
CXSATGCAOST 
CCCTGTTCCT 
GCTTCATCCA 
CCATTTTGGG 
CXaiTGTTCAr 
GCCAG(3GCTT 



I 



CTTGCCCCAG 
GCTGQACCTQ 
CTCCTACCCC 
TAGCCATTTC 
TGGAACATGA CAAGGAATTC 
CCCAGAAGGG 
ACGCTTCGAC 
CAGGGAATOQ 
GCTGGTGTTT 
GGGCCTGCAG 
CCTGATTGGG 
CACGGAGGTC 
CTCCAAGACQ 
CAGCGACCAG 
CTGGCAGGAC 
TGCCTTCCCT 
OCACTTCATC 
CATCTCOGTC 



TACAAGGCCG 
CTCACCGACG 
TTCCACCCAC 



ACCX5ACX3GCC GGGACGGCOS CCTGCCCCCT 
CCGGAQCCCG GGTAGCGCGT AOAGCCGGCQ 
CGCCGCACAG CTTCGTGGCG CTCTGGGCAC 
ACTTCAGCCT GQACAAOJAG GTGCACTOGA 
AGOGGOGGGA GATGCAGCGC GAQATCCTCT 
CQCACCTCCA GGGCAAQCAC AACTGGGCAC 



CCTCTGGCCA 480 

GTCAACCTCG 540 

CGGTTTGATC 600 

TACAAGGACT 660 



ACATCCGQGA 
AGCACTTGGG 
AGGAGGGCTG 
GGCACAACCT 
AGTTGGCGGG 
TCTTCAAQGC 
GCCAGAACCQ 
AGAACAGCAG 
GAGACCTGOG 
AGGGGGAGTG 



AATGAGACGT 
GATCTCTTCC 
GACATCACAQ 
CTCTCGGTGG 



CATCAGCTTC 
TCGABAGTTC 
ATTCCGGATC 
C6TTTATCAG 



CCACCAGCAA 
AGACGCTGGA 
CCCRGAACAA 



CCXaAGAACC 
AGGCAGGCCT 
TGGATCATCG 
CTQAACTCCT 



CTCTACTTC8 



AGGAAGCCCT 
GTAAGAAGCA 
CX3CCTGAAGG 
ACATGAACGC 
CGGTGOOCftA 
ATOACAGCTC 



CCACTGGGTG QTCAATCCGC 
ATCAACCCCA 
ATGGTGGCTT 
AAACAGCGCA 
AACGTGGCAG 
GTCAGCTTCC 
TACTACTGTG 
GCCATCGTGC 



GCAGCCCTTC 
CACGGGGAGC 
GCGGATGGCC 
CGAGCTGTAT 
CTACGCCQCC 
CACCAACCAC 
GCCCTGCTGT 
CAAOSTCATC 



CIGAAGAAAT 



TTGGGGCCAA aTTTTTCTGQ ATCOTCCATT QCTCGCCTTQ GCCAGGAACC AGCAGACCAA 
CTGCCTTTTG TGAGACCTTC CCCTCXXTTAT CCGCAACTTT AAAGGXGTGA GAGTATTAGG 
TUUVCATGAGC AGCATATGGC TTTTGATCAQ TTTTTCAGTG GCAGCaVTCCA ATGAACftAGA 
TCCTACAAGC TGTGCAGGCA AAACCTAGCA GGAAAAAAAA ACAACGCATA AAGAAAAATG 
GCCX3GGCCAG GTCATTGGCT GGGAAGTCTC AGCCATGCAC GGACTCGTTT OCAGAGGTAA. 
TTATCAGCX3C CTACCAGCCA GGCCACCCAQ COGTGQGAGQ AAGGOGGOGT GGCAAGGGGT 
GGGCACATTG GTGTCTGTCC QAAAGGAAAA TTGACCCGGA AGTTCCTGTA ATAAATGTCA 
CAATAAAACQ AATGAATG 



I 



41
MKVRSLSAAA PHSFVAIiWAP LFliLRSALAD FSLDNEVHSS FIHRRLRSQB RREMQREILS
IU3LPBRPRP HLQGKiaiSAP MFMUSLmiAM AVBEGGGPGG QGFSyPVKAV FSTQGPPLAS 
LQDSHFLTDA DMVMSFVHI.V BKDKEFFHPR YHHREFRFDL SKIPEGEAVT AAEFRIVKDY 
IRBRFOKETF RISWQVLQB HIiGRBSDIiFIj LDSRTLWASE EQWLVFDITA TSNHWWNPR 
HMLGLQLSVE TLDGQSIMPK LAGLIGHHGP QNKQPPMVAF FKATBVHFRS IRSTGSKQRS 
QNRSKTPKMQ EyU-RMASVAE KSSSDQRQAC KKHELYVSFR DLGWQDHIIA PEGYAAYYCE 
GECAFPLNSV MNATNKAIVQ TXjVHFXMPBT VPtCPCCAPIQ LNAISVLYFD OSSNVILKKY 
RNMWRAGGC H 






GGATCTGAGG 
GGGGCTTGGG 
GAGGAATTAT 



GTGCTTTTTC 
CACaGGTTCC 
CTTGTQCTGA 
GAAGGTAATT 
AAAATATCGG 
TTCCGACACT 
GCCAATTATT 
TTCTTTGAAC 
GCTGTGGCTA 



AGGCAGCCTG 
CTGATAAAAT 
CCCTOGAAAA 
TTTTCTCTTC 
TTGAACAGCT 



GTAGTCCATG 
CAAAATTCCA 
OTTQTQATGT 
TACCTOCATA 
ATCTTGATAG 
ACTCTGGCTG 
GCACCX3ATCT 
CTAGCTACCA 
AAACTGGCCA 



GTTTCCCTGA 
CTGTTCCATG 
GTAACCCCAA 
CAGACTGCCT 
GCCTCTATGT 
TTCTCATC7VT 
TATTTGTGTC 
CTCACATAGG 
TTGAGGCAAC 
TTATTTACTT 
ATCTCATCTT 
GCTGGGGGTT 
ATOOOXOGTa 
TAGCAGCTAT 
AAATCTGGGA 
AATC6ACACT 



21 
I 

C3«rrTCCTCc 

CTCTCCAGTC 
TCCTGGGTTA 
T6ACCTTTTT 
TTTTTCTAOG 
GGATTCTCAT 
ACAATGTGAA 
ATGGGATGGA 
CCCTCCTTAT 
TGGAACATG6 
TCGCTTTCTG 
AATGTATACC 
TGGTTACTTC 
TTTCATGCTQ 
AGTAAAGGAG 
TTCTGTGGAC 
CCTGGCTACA 
TOTG6CTTTC 



I 

ACGTTCTGGT 
CXTTATCCACC 
ATATTTTTAA 
ATGCTTCGAA 
ATAAATGAAA 
GGCACCATTA 
CTCAACATCA 
CTCATTTGTT 
ATTTATGACT 
GATTTTATGC 




AAACGGA6A0 
GCAGTTTGTC 
GCATTTCTTC 
CTATAGAGGR 
CAGCTCAACT 



GCAGATTGTC 
CXAGGAGGGA 
AACAGTGGGG 
AGGAGTTGCT 
TAAAACATGG 
AAAOCAAQAA 
GTTQGCTACT CCATCTCTTT TOGTrCCTTa 
ATIGCACTAO GAACTATATC 
GCATCTTTGT CAAAGACA6A 
TAATAATGCA GGATGACCCA 
ATATOGGGTG CAAGATTGCT 
GGATCCTGGT GGAAGGTCTC 
CCAAATACCT QTGGGGCTTC 



AGA6CTACAA 
CT6GAGTCCC 
AAATCACAAT 
AATXATTATT 



TCCAGCnOCA TTTGTTCCAa CATGGGCTCST GGCACX»GCA 
CTQQQAACTT 
TGGGCTGAAT 
GACCAATGCA 



TTTATTCIGT 
GTTGGGCATG 
CTAQTCTTTQ 



GAOTGCATTA CaTCGTGTTC
gtatgcx:tqc ctcactcctt
ttcttcaact cctttcacgg 
gttcaggcag aggtgaagaa 

ACacCGCCAT GTGGOUSOXS 
AGCAGCChGT CACAGGTGGC 
AAOATOQCCA GCAQACAGCC 
TCAOAGChGG ACTGCCTOCX: 
CAGGGAGATG ATATTCTAAT 
OAAGGATGCC AAGGAOAAAC 

Seq ID NO: 661 Protein



CACTGGGCTC 
TTTCTTTGTG 
GATGTGGAGT 
CAGATGCGGC 
GGCCAGCACA 
TGACAGCCAC 
ACACTCTTTC 
GCJAOAAGCCT 
TGAGGATGTT 



GGGTGGGAGA TCCGCATGCA 
TCTATCATCT ACTGCTACTG 
CX3GTGGAATC TCTCCGTGGA 
TCAGTGCTCA CCACCOTGAC 
CXSCATGGTGC TTATCTCXGG 
ATCACTTTAC CTGGCrATGT 






CAATGGAGAG 
CTGGAAAAGG 
GCACAGCACC 
CAAAGCTGCC 
CTGGAGTAAC 



! 

MUSSLSTSI 
UHTAOLQEG 

HHliHCTRNYI 
KSQYIGCKIA 
FVAAWAVARA 
VGHDTRKQyR 
SIIYCYOJGE 
RMVLISGKAA 
SRFMESHPDT 



11 



21
VLFLFSSPST INESISSRKS HRFIiEOUJSD GTI7IESQTV IiVLKAKVQCE
EQHCFPEWOG LICWFRGTVG KISAVFCPPY lYDFNHKGVA FRHCMPNGTH 
ANYSDCLRFL QPDISIGKQE PPERLYVMSfT VOYSISFGSI. AVAILIIOYP 
RMHLFVSFML RATSIFVKDS WKARIGVKB LESIiIMQODP QNSIEATSVD 
WMPIYPIAT NYYW11.VEGL YUiMUFVAF PSDTlOniWGF IlilGWGFPAA 
TLADARCWEL SAGDIXWIYQ APIIiAAISUT FZT>FUITVRV LATKIWETMA 
RIAKSTLVLV IiVFGVHYIVF VC3iPHSFTGI. GWBIRMHCEL FPNSFQGFFV 
VQAEVKKMWS RnHLSVDWKR TPPCGSRSCQ SVI.TTVTHST SSQSQVAAST 
KIASKQPDSK ITLPGYVHSN SEQDCLPHSP REBTKEOS6R QGDDILMEKP 



Seq ID NO: 662 DNA sequence
Nucleic Acid Accession #: NM_0
Coding sequence: 143..1795
TCTTCCTACA
GCTAATOCTC 
TATAGAGGAG 
AGCTCAACTC 
GCCCAGAOGA 
CAACCATAAA 



CAGCATAGQA 
CATCTCTTTT 
TTQCACTA6S 
CATCTTTCTC 
AATAATQCAG 
TATOSGQTGC 
GATCCTGGTG 
CAAATACCTG 
ATGGGCTQTQ 
CATCAAOTGG 
TCTGAATAOG 
CACAAG6AAG 
AGTGCATTAC 
CCGCATQCAC 
CTGCTACTGC 
CTCOGTGGAC 
CACCGTQACG 
TATCTCTGGC 
TGGCTATGTC 
CAAGGAAGAT 



CQCAGCTGCC 
CAGATTGTCC 
CAGGAGGGAQ 
ACAQTGGGGA 
GGAGTTGCTT 
AAAACATGGG 



AACTATATCC 
AAAGACAGAG 
GATGACCCAC 



GAAGGTCTCr 
TGGGGCTTCA 
GCACGAQCAA 
ATTTATCAAG 
GTTAGAQTTC 



CATTTQTGGC 



GGAQTAGTTT 
GCTCTGTGAT 
GCTGTAGCTT 



ATTTATTTTG 
GATCTAAGAA 
TTATAACAAT 
ACATCCCITC 



ATCGTGTTCG 
TGTGAGCTCT 
AATGGAGAGG 
TGGAAAAGGA 
CACAGCACCA 
AAAGCTGCCA 
TGGAGTAACT 
AGTGGGAGQC 
CCAGACACTG 
TOACTTTCAT 
GCTTGAGTTC 
CATGAATTGQ 
ATTACCTTCT 
TGTTCATTTT 
TCTCTCATAT 
TAGAAACTAQ 
CCTGTGCATA 



TCCTGGCCAG 
TTGTGCTGAA 
AAGGTAATTG 
AAATATCGGC 
TCCGACACTG 
OCAATTATTC 
TCTTTGAACO 

ACATGCACTT 
TAOTCCATGC 
AAAATTCCAT 
TTGTGATGTT 
ACCTGCATAA 
TCTTGATAGG 
CTCTGGCTGA 
CaCCGATCTT 
TAOCTACCAA 
AACTGGCCAA 



AGCCCAGcra 
^GCGAAAGTA 
TTTCCCTGAA 
TGTTCCATGC 
TAACCCCAAT 
AGACTGCCTT 
CCTCTATGTA 
TCTCATCATT 



GATTCTGATG 
CAATGTGAAC 
TGGGATGGAC 
CCTCCTTATA 
GGAACATGGG 



TCACATAGOA 
TGAGGCAACT 
TATTTACTTC 
TCTCATCm 
CTGGGGGTTT 
TGOGAGGTGC 
AGCAGCTATT 
AATCTGGGAG 
ATCQACACTG 
TCACTCCTTC 



ATGTATACaS 
GGTTACTTCA 
TTCATGCTGA 
GTAAAGGAGC 
TCTGTGGACA 
CTGGCTACAA 
GTGGCTTTCT 
CCAGCAGCAT 
TCGGAACTTA 



CACCGCCATG 
GCAGCCAGTC 
AGATCGCCRG 
CAGAGCAGGA 



TACATGTGTT 
TTT7GAATGG 
ACtaiTOTCaVT 



AAGGATGCCA 
GGGCTGGTCC 
AAAGGCTOAA 
CTCCTGTAAA 
ATTGGCATCA 
TTTCTGCTAC 
ATATCACCXrr 
TATTCTCTTA 
GGAGCAATTA 
CTGGAAAATT 
TTTGGQAACA 
CCTCTTTGTG 
GTGQAAAQAT 
TTTTGATAQC 



GGTGAAGAAG 
TGGCAGCCGC 
ACAGGTGGO? 
CAGACAGCCT 
CTGCCTGCCA 
TATTCTAATG 



ACCAATGCAG 
GTCCTGGTCC 
ACTGGGCTCG 
TTCTTTGTGT 
ATGTGGAGTC 
AGATGCGGCT 
GCCAGCACAC 



AAGTTTQCTC 
GOTCCCTOCT 
TCTGGGGTT6 
GC3VCCATTAC 
TCAACATCAC 
TCATTTGTTG 
TTTATGACTT 
ATTTTATGCA 
AGCXAGATAT 
TTGGCTACTC 
GACGATTGCA 
GAGCTACAAG 
TGGAGTCXCT 
AATCACAATA 
ATTATTATTG 
TTTCGGACAC 
TTGTTGCAGC 
GTGCTGGAGA 
TTATTCTGTT 
TTGGGCATGA 



AATGGCTGGT 
AATTCAGTTA 
TACXAACGAC 



CACTCTTTCC 
GAGAAGCCTT 
QAGGATGTTC 
TGTGT6AGA0 
AGGTGTTACT 
ATGAAAATGC 
TAAATTAATO 



GCATGGT6CT 
TCACTTTACC 
AC6AGGAGAC 
CCftGGCCTAT 
TCTGAATGGA 
GGCTTGQCTG 
TAATAATAGT 
AAGIGTCAAT 
TATGGTATTT 



TTTTOOGTAS AAAAAAQATT CAATTCCTia 



AAATATAATG 
TTTCTTACTT 
GGATCTAAAA 
AGTTGGCTGG 
AGGAAAATTT 
ACCAGCX»GA 
TTCCTCAQTT 
AAATCKIGCT 



AAGATCTTTT 
TAATGTACTT 
AAATATATGG 
ACATTGATAA 
CTCAAAAAAG 
CCTCAGGTCT 
AGTOAGCna 
GCATCTATAT 
TTAAAAATTT 



TGTCTGCAAA 
CTTTTTCTTG 
GTTTTAAAAA
KAGUSASLHV HGHLMLGSCL lAKAQUDSDG TITISEQZVIi VUCMCVQCBIi NITAQIOEOE
<aiCFPEMDGI. ICWFSGTVGK ISAVPCPPYI YDPHHKCVAF RHCNPNGIWD CMHSLHXTWA
NYSDCLRFLQ POISIGKQEF FERIiYVMyTV CnrSISFOSIA VAILIIGYFR RlfCTRNYIH 
MRt>FVSFMI.R AT8IFVKDRV VHABI6VKEI. ESIiIMQDDPg NSIBATSVDK SQYZGCKIAV
VMPIYFIATO YYWILVEGLY 1«HI.XFVAFP SDTKYLHGPI LIGWGFPAAF VAAWAVARAT
LADARCHEIiS AGDHmiVQA PILAAIOUfF ZLFI.NTVRVL ATKIHBTMAV GHDTRKQYRK 
LAKSTIiVLVI. VPGWHVIVFV CLPHSFTGLO WEIBMHCELP PNSFQGFFVS IIYCYCNGEV 
QAEVKKMWSH WNLSVDKKRT PPCGSRRCX3S VLTTVTHSTS SQSQVAASTB MVLISGKAAK 
lASRQPOSHI TI.FGYVWSNS EQDCbPHSFH BETKEOSGRQ GODILMEKPS RPMESHPDTE 
GCQGETEDVL 

Seq ID NO: 664 DNA sequence
Nucleic Acid Accession #: NM_012152
Coding sequence: 43..1104






CTTCTTTAAA 
GACAAGCACA 
GOAACAAAGC 



CTGCTCATTT 
TGGAATTGCC 
TACCTTGTTT 
CTGCGGATCT 
TCCATCAGCC 
GaSTTTGTGG 
AGGCAGTGTG 
GTCGTGAACC 
ATGATCTGCT 
GTCCTCAGCA 
GTCTGCAATA 
GTCTTAGG 



TGGACTTTTT 
TTGTGATTGT 
TGGTCATCGC 
CTAATTTAGC 
CAGGCCCftGT 
ACAGTAGCTT 
CAATCATQA6 
TGCTTGrCTG 
rCTGCAACAT 
TCTGGACAGT 
ACGTGTACGT 



21 

I 

V GGATGTTCAC 
TTATAATAGG 
TTTGTGTGTT 
GGCAGTGATC 
TGCTGCCGAT 
TTCAAAAACT 
QACIGCTTCC 
QATGOGGGTC 



TTCTTCTCCA 
AGCAACACTG 
GGGACGTTTT 
AAAAACAGAA 
TTCTTCGCTG 
TTGACTGTCA 
CTCACCAACT 



TATCCTGGAC C 



CTCTGCCTGC 
GTCCAACCTC 
CAAGAGGAAA 
ACCCATGAAG 



ATTTTTATCG 
TCTTCCCTGG 
ATGQCCTTCC 
ACCAACGTCT 



CAATGAATGA 
ATACTGTCXJA 
TCTGCCTGTT 
AATTTCATTT 
GAATTGCCTA 
ACCGCTGGTT 
TGCTGGTTAT 
TGACCAAAAA 



TGACTGQACA 120 

TATTTTTTTT 180 

CCXCTTCTAC 240 

TGTATTCCTG 300 

TCTCCGTCAG 360 

CGCaSTGOAG 420 

GAGGGTGACA 480 



CCATCATCTA 
GCTTCTCTCA 
GGAGTQACAC 
AAAGCACTTC 



CTAAACrCTG C 



GGGCGGTCCC CACACTGGGC 
CCCCKATTTA CAGCAGGAGT 
TCATCATGGT TaTGGTQTAC 
TGTCTCCGCR TACAAGTGGG 
CGGK3ATGAC TGTCTTAGGO 
TCCTCGACGG CCTGAACTGC 
GCTCAACTCX: 
CATGAAGAAO 
CCCCTCCACA 
CCAAGGTGCA 
GGTGATGACT 
MHECHYDKHM DFFYNRSNTD TVDDWTGTKL VIVLCVQTFF CLFIFFSNSI. V
FHFPFYYLLA NIAAADFFAQ IAYVFIJ»IFtlT GPVSKTIiTVN RWFLRQGIil-D SSLTASLTNL 
LVIAVERHMS IKRMRVHSHIi TKKRVTUiIL laVHAIAIFTIQ AVPTLQHNCIi dflSACSSIiA 
FiySRSVLVF MTVSHUWFL IMWVYLRIY VYVKRKniVIt SPRTSGSISR SRTPHXLMKT 
VMTVtiSAFW CWTPGLWU. UXSLHCRQCQ VQKVKRHFLL UVLLNSWNP IIYSYKDEDM 
YGTMKXMICC FSQENPERRP SHIPSTVLSR SDTOSQYZED SISQGAVCtlK STS 

Seq ID NO: 666 DNA sequence
Nucleic Acid Accession #: NM_002821
Coding sequence: 150..3362



I 1_ 
AACTCCCGCC 

GCGCTCCGGT GOGTCOGCCT 
CCTCAGCrcC 
CCCGCCGGTT GCCTCTGCTC 
CCATTGTCTT CATCAAGCAG 
TTCGCTGTGA GGTTQAGGCT 
CTGTCCAGGA CACGQAG06G 
ACOSGCTGCA GOACTCTGGC 



CCTGTGCCCG CCGCGGAGCA GTCTGCGGCC C 



TCAAOCATCC ASCCTCGGAA 
ACATTGATGG OCAOCCTOG G 
ATGGTCAGAG CAACCACACA 
GTCCTGAGCA TAGTGGGCTG 
GCAGCCAGAA CTTCACCTTG 
CCCAGGACGT GGTAGTAGCG 
AGCCACCCCC GAGCCTGCAG 



AGCGTCCTGC 
CCGTCCTCCC 
C CGGG CCCGG 
CGTTTCGCCC 
ACCTTCC3VGT 
TCCTTCAACA 



CCCACCTACC 
GTCAGCAGCA 
TATTCCTGCT 



TCATCCTGOA ROCCaCACTT 
GGGTGTTTAC AGCTGGCAGC 
AGCCCAGCGT GTGGTGGGAG 
AGAAGGGCCA CGAGCTGGTG 
GCCACGOGGC CAACCTGGCT 
TGCCCTCCTG GCTGAAGAAG 
TGGATTGCCT GACCCAGGCC 
TCATCTCAGA GGACTCAOGG 



CACCTAGCAG 



AOQOGCAAGC COGTQTCCAA 
AGCAOTGCAT GGAGTTTGAC 
AGCCCACTAT TAAGTG OGAA 
AOGCTGGGAC CCTGCATTTT 
TTGCCTCC»A CGGGCCGCAG 

TTATCAcerr caaagtggaa 

TGCAGrGCSA GGCCCAGGGG 
TCCTGGACCC CACCAASCTG C 



TTGGCX»ArA 
GGTCAGOGGA 
CCCCAAGACA 
ACACCAAAAC 
TTCGAGGTCT 
TGGTACCGTT 
GTGCTGGAAA 
AAGGAGGCCA 
CGGGCAGATG 



TGGGAGCTGC 
TGCTGCCGCT 
AGGATGCACT 
TACATGTGTA 
AGGGCAGCAG 
GTGTGGCTCG 
TCAAATGGAT 
AX3CCACAGAC 
AATGGTTCOS 
AGGAGCXXJAA 
GCGCCCACAG 
ATGAAAGCTT 
AGGCXaVTGTT 
AGGATGAGAC 
TTGCCAAOGQ 
GCATTGGCCA 
AGATTGAAGA 
TGACCTGCCT 
TCCGGCTGCC 
TTGCTGAAAG 



GCAGGGGCGC 
CTGGCTGCTC 
CCTGAGCTTT 
GGATGATGTC 



COGGCCAGAC 
ACCCAGACAG 
CGGGCGCTGC 
GATGGGGCCC 
GCAGCTGTGG 
ACTGGAGAAG 



TGAGGCAGGT CCT6TGGTCC 540 
CTTCGTTGCX! 600 
CCCCTTTCTG 660 
CGGCXSM3CTG 720 

TQCITTTGGC 
TGCCAGGGTG 
CCATTGCC3«3 
TCCCATCACT 
QTCTCTGCTG 
GGGGCAGAGG 
CATGCCGCTA 
TCCCtXCAAG 
CACCCATGGC 



GTGCTQQCAC 840 
TTCTCAQCCC 900 



GGCCCACCCA 
TTTGAGCCAC 
GGTCTGCCAG 
AGGGTCTACC 



GCCAGCTGGA 
CTACAGTTGT 
TCAAGAATGG 
GTATGAGCAG 
AGCTCAAGTT 
CGGTGCCCTG 
GGAGCAGCCT 
CTCOAOATGA 



CAACATCACT 
GGAGGGCAAA 

GACCTTGCGC 



CACACCACCA 
TTCAGCCACA 
CCX»GAOTGQ 



GTGGCCACTG 
CCCGGCTACT 
AACCAGATGC 
ATCAACAGCG 
GGCAGCATCG 
CCCCAGCCAC 
GGCCX3AGAGA 
GTGACAGACA 
•ACACTTGCA 



: GTGCGCATCT CCAOCTCACT GTGGCAOTTT 
ACAGCCCTAC 
AAGGACCQCA 
TCOCTGGTQA 
TCCATGACGT GGCCCCTGAQ
ACATCAAGCA CACGGAGGCC 
AC5GGCCCTGG CAGCCCTCCC 
CCGCTGTGGC CTACATCATT 
AAGCCAAGCG GCTGCAGAAG 
GAGGGCCTTT GCAGAACGGG 
GCTTGGGCrC CGGCCCCGCQ 
TCCCAOGGTC TAGCXTTGCAG 
TCCTGGCAAA GQCTCAGQGC 
GCCTGCA6AC GAAGGATGAG 
GGAAGCTGAA CCACGCCAAC 
ACTACATGGT GCTGGAATAT 
AGAGCAAGGA TGAAAAATTG 
GCACCCAGGT AGCCCTGGGC 
TGGCTGCGCG TAACTGCCTG 
TCAQCAAGGA TGTGTACAAC 
GCTGGATGTC CCXXXIAiGGCC 
CCTTOGGTGT GCTX3ATGTGG 
CAQATGATGA AGTACTGGCA 
GCTQCCCTTC CAAACTCTAT 
GGCCCTCCTT CAGTGAGATT 
GAGGAGGGAG CCCGCTCAGG 
CAGCATGATG GGCAAGATCC 
TTGCTGAG8T CTGAGCAGGG 



GACTCAGGCC G 



GCCGTGCTGG 
CAOCCCQAGG 
CAGCCCTCAG 
GCCACCAAC& 
CCCATCACCA 
TTGGAGGAGG 
CAGCAGCAGC 
GTGGTGCGGC 



GCCTCATGTT 
GCGAGGAGCC 
CAGAGATCCA 



CATTQCAGGC 
GCCTGTGCCG 
CATTGGGTTG 
CTACTGCAAG 
AGAGATGGAA 
AGAAQAAGTG 
CACAAGTGAT 
GAGTGAGTTT 



AACAGCTGCA 
GAGGAGTCGG 
TCGGTGGGTG 
AASCGCIGCA 
TQCCTCAAOG 






AASATQCACT 



TGGACTTCCG 



AAGTCACAGC 
ATGGAGCACC 
OTCAGTGCCC 



GAAGT6TTTA 
GATTTGCAGG 
CGGCTGATGC 
GCCAGCGCCC 
ATGGCCTGGG 
CTGTCCTCCT 



GAGACCTCAA 
CCCTCAGCAC 
TGTCCAACAA 
AGAGACAAGT 
ACCACTTCCG 
GTGACTTCTC 
CACATGGAGA 
CTGGGAAGGC 
AGCGCTGCTG 
TGGGAGACAG 
CRGGGGAGGA 



GTGCCGGGAG 
GCAGTTCCTG 
CAAGCAGAAG 
CCGCTTTGTG 



GATGCCCCAT 
TAGACTTCCT 
GGCCCTCAGC 
CATO3TGGAC 
CATCrCTAGA 
GTGCCCTAGT 



CTTGTGAAGA 
OAGATOTTTG 
GCTGAGCCCK 
AGGATTTCCA 
GTGGCCCTAT 
CATAAGGACT 



CCCAAGGACX 
AGCAAGCCGT 
GG6AAGCTCA 
GCAACAGGCA 



AOGCTTGGGA 
AGGGTTAATG 
ACACAGCAAG 
CCCCACCCTT 
CTTTTGACAC 



GCCATCCTTA 



TATATAAACC 
GGTGGGTGGO 
CCCCACACTT 
TTTACACTCG 



6GG0QACTAG 
GTGTGGGTGC 
AACTCTGCCA 
TTGTGQGGAa 
CCACTGGTCC 
CCACTCTGGG 
CTCATCCTAA 
GCCCTTTTTG 



GGCTTTGAGC 
CAC3^TAAC 
CTCATCTGCC 
TTCCTTAATA 



i I.LPI.U5GTQT 



IKHIEAGPW 



EANFKCQFSA 



VRLPTHGRVY 



CMSSTPAGSr 
GSSLPEWVTD 

TTVYoarrAL 

RYTCIA«arSC 
GLMPYCKKHC 
KHHSTSDKMH 
UDFRRELEMF 
PLSTKQKVAIi 
YHFHQAWVPL 
A6KARI.PQPE 



I.KHPASEAEI 
GPEHSGLYSC 
QPPPSLQWLF 
IILBATIiHLA 
QKGHELVIAN 
LDCIjTQATPK 
EAQARVQVLE 
NAGTLHFARV 
UJCEAOGDPK 
MIKHTEAPIiY 
KAKRLQXQPB 
FPRSSLQPIT 



CTTGTGCACA 
GTGCCTGGCA 
TATGCACCAC 



AIVFIKQPSS 
DRLQDSGIFQ 
HIDGHPRFTY 



PTWWYRNQM 
KLKPTPPPQP 
TRDDA<aJYXC 
PLIQHKGKDR 



CCC3UITTTCT 
AACTTTGCCT 
TTCTCAAGTT 
CTAGACCAGG 
CTGACCCAGA 
GATGAAGGAG 



QDALQGRRAL 
CVARDDVTGE 
QWFRDGTPLS 
DESPARWIA 
FANGSLIiTQ 
VTCLPPKai.P 
RQDVNITVAT 



CCCCTGCCAC 
GGCCTTCAAC 
GGGGAGGGCI 
CTGGGCACAC 
ATTATAGAGG 
CCCACGTCTT 
TTTTCaGGAG 
TATATGTAAT 



QQCMEFDKEA 
lASNGPOGQI 
ILOPTKLGFR 
EOPGSPPPYX 



CTQVAIiGMBH 
RMMSPBAILB 
GCPSKLYRLM 



LSMNRFVBKD 
(SFSTKSDVW 
QRCNALSPKD 



FLAKAQGLEB 
HYMVIiEYVSL 
LAARKCLVSA 



TVPC3ATGRB 
RAHVQI.TVAV 
MHIFQNGSIiV 
MIQTZGI.SVO 
AEIQEEVALT 
GVAETLVLVK 
GDLKQFLRIS 
QRQVKVSALG 



RPSFSEIASA I.Ca>STVDSKP 



Seq ID NO: 668 DNA sequence
Nucleic Acid Accession #: Ea
Coding sequence: 1..1389



ACCCTTOTTT 
GrTGTCAACT 
GGGTTTOCTT 
GTTTTATT6A 
AAAACTTTOQ 
ATAGCAATGA 
ATOCCAGGAG 
ACAGTTACCT 



X GCCTOTCATC C 
CTOAACATGA GTATAAAGAG 
ATCTGQTATA 
GCTTTTATTC 
GGCCCTCTCT 
QTATCTQCTC 
TATAATAGCT 
AAAOBTGTTT 



TCACTGGGTC 
ATTCAAGCGG 
TACAGTTCTC 
GTQATTTCTG 
TTCACCCAAG 
AOATTTTGTT 
GAC6TAATTG 
ACAGTOATGO 



TAAGTTACAA 
TT6ATCCTGA 
TTACTCTGCC 
CTACAGGTTT AACAACTCTG 
CACACATACC 
TCGGGGTTAT 
TAGAAGAACX: 
TATTTATCTG 
GGGACTTATT 
ATGGTGTCAC 
CCAATGTGTT 
TCATCACTGT 



CACAGTAGCT 
TATATTCTTT 
TGAAAATTAC 
TGTCATTTTG 



AAAACCTOTC 
ATAGGATTQC 
TGGGTTTCAT 
GGAACAGATA 
CTCTCTGTTC 
GGAGATACTT 
ATTGCTCQCC 
TACCGAAATA 
ATTCTT6GAA 
GACGCTTGGQ 
TTTATTTGCC 
AAGTGGTCCC 
GCTACATGTQ 
TGCAGAAATQ 
AC31TACCCTA 
AATCTTTCAT 



AGTCTGCTGC 
CTTATTCAAT 
ATGTTAOGGA 
CCTACCAGTC 
TTCAQTTTTT 
TGAGCAAAGT 
ACTTCATTAT 
TAGCAAAGCT 



TATTTGCAAA 
ACCATAACTC 
GCCTTATCCA 
GATACTTGAC 
ATGACCTGGT 
TGSAATQCTT 



TCTTATCTGA * 



A6CCAOGCTT GTGTCATT6C TGATTGATTG C 
TmTATCAV 
ATAAGATTAT 
TCATGGCTAT 



2280 
2340 
2400 

2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 



34B0 
3540 
3600 
3660 
3720 
37 BO 
3840 
3900 
3960 



EARSANASFH 120 

DGQSNHTVSS 180 

PQDVWARYE 240 

VRPRNAGIYR 300 

EFSVHHEHAG 360 

VPSWLKKPQD 420 

VEVYDGTWYR 480 

KPTIKWERAD 540 

FITFKVEFER 600 

IKDVAPEOSG 660 

AAVAYIIAVL 720 

SLGSGPAATN 780 

SLQTKDEQQQ 840 



I.SKDVYHSEY 960 



RMMSPBAILB (SFSTKSDVW AFGVU1HEVF THGBMPHGGQ ADDEVIADLQ 1020 



I 

TOl 

TCTTTTTAAT 
GAAGCAAGCT 
CTTTTCCCTT 
TTTGGTCAAT 
GTATCCTTTT 



TGGACTTTCC 
TQGAAAGGTC 
AAGGGCAATT 
GCCCAATGCC 
CTTCTTAQTT 
TATGTCCATC 
ATTTACTGGC 
AACATTTGOA 
TGTGACAAQA 
CM 



r TTTGGATTCG 



TACAAATACT 



1020 
1080 
1140 
1200 
1260 




CAAGRCTGCA CCCATGGGCA GGAAATGTTC TACTGCTTTC CTGACAATTT CTCTCTCACA 1320 
AATACCTCAG AGTCTCATGT TCAGCAGACA ACACAACTTT CTACTTTAAA TATTAGTATC 1380 
TTTCAATGA 

Seq ID NO: 669 Protein sequence






lU MGYQEQEPVI PPQRDU3DRE TLVSEHEYKE 
GFPIjGII.LI.F KVSYVTDFSL VLLIKGGALS 
lAMISYNIIA GDTIiSKVFQR IPGVDPENVF 
SIiISTQLTTL ILGIVMARAI SLGPHIFKTE 
YSaUEEPTVA KWSRLIHMSI VISVFICIPF 
15 RFCYGVTVIL TYPitBCPVTR EVIANVFPGG 
L TPLIFIIPSA CYLXIiSBEPR 
yCFPDMFSLT HTSBSHVQQT 



31 
I 

KTCQSAALFM 
GTDTYQSLVN 
IGRHFIIGLS 
DAWVFAKPHA 
ATCSSYLTFTG 
MLSSVPHIW 



TQLSTWISI FQ 



41 
I 

WNSIIGSGI 
KTFGFPGYLL 
TVTPTLPLSli 
IQAVGVMSFA 
FTQGDLFENY 
TVMVITVATL 
MLFIGAWKV 



IGLPYSMKQA 
LSVLQFLYPF 
YRNIAKLGKV 
FICHHNSFI.V 
CaiNDDLVTFG 
VSLLIDCLGI 



Seq ID NO: 670 DNA sequence
Nucleic Acid Accession #: Eos sequence
.1284 



25 
30 
35 
40 
45 
50 
55 
60 
65 
70 
75 
80 
85 



ATQGGCTACC 
AAGCAA6CIG 
TTTTGCCTTG 
TTGG TCAA TA 
TATCCTTTTA 
TTTCAAAGAA 
GGACTTTCCA 
GGAAAGSTCT 
AGGGCAATTT 



I 



ATGTCCATCG 
TTTACTGGCT 
ACATTTGGAA 



AQAGOCAOGA GCCTGTCATC 
GQTTTCCTTT 
TTTTATTGAT 
AAACTTTOGQ 
TAGCAATGAT 
TCCCAGGAGT 
CACTTACCTT 
CCCTCATCTC 
CACTQGGTCC 
TTCAAGCGGT 
ACAGTTCTCT 
TGATTTCTGT 
TCACCCAAGG 



CGGCC6CAGA 
CTTTTATTCT 
GCCCTCTCTG 
TATCPQCTCC 



I I 

GAGGATTGCC TTATTCAATG 60 

GGGTTTCATA TGTTACAGAC 120 

GAACAGATAC CTACCAGTCT IBO 

TCTCTGTTCT TCAGTTTTTG 240 



AAGTTACAAT ATAATAGCTG GAGATACTTT GAGCAAAGTT 300 
TGATCCTGAA AAajTGTTTA 
TACTCTGCCT TTATCCTTGT 
TACAGGTTTA ACAACTCTGA 
ACACATACCA A 



ACAAATACTC 
TCTCTCACAA 
ATTAGTATCT 



GA' 

AGGTAATTGC 
CAGTGATGGT 
TTCTAGAACT 
GTTATCTGAA 
TGCTTCCCAT 
AAGACTGCAC 
ATACCTCAGA 
TTCAACTCGA 



AGAAGAACCC 
ATTTATCTGT 
GGACTTATTT 
TGGTGTCACT 
CAATGTGTTT 
CATCACTGTA 
CAATGGTGTG 
ACreXCTGAA 
TGGTGCTGTG 



TCTTTTGCAT 
ACA6TAQCTA 
ATATTCTTTQ 
QAAAATTACT 
QTCATTTTGA 
TTTGGTGGGA 
GCCACGCTTG 
CTCTGTECAA 



ACOSAAATAT 
TTCTTGGAAT 
ACGCTTGGGT 
TTATTTGCCA 
AGTGGTCCCG 



GCAGAAATGA 
CATACCCTAT 
ATCTTTCATC 
TGTCATTGCT 
CTCCCCTCAT 
CACACTCCGA 
XTGGATTCGT 
ACTGCTTPCC 
CACAACTTTC 



CTTCATTATT 
AGCAAAGCTT 
TGTAATGGCA 
ATTTGCAAAG 
CCATAACTCC 
CCTTATCCAT 
ATACTTGACA 
TGACCTGGTA 
GGAATGCTTT 

GGTrrrccAC 

GATTGATTGC 
TTTTATCATT 
TAAQATTATQ 
CRTGGCTATT 
TQ ACAA TTTC 
TACTTTAAAT 



11 



21 



31 



51 



ATGGGCTACC 
AAAGGAGGGG 
TTTCCAGGGT 
AGTTACAATA 
GArOCTGAAA 
ACTCTGCCTT 
ACAGGXTTAA 
CA CATA CCAA. 
GGGQTTATGT 



AQAGGCAGQA 

cccTCTcroa 

ATCTGCTCCT 
TAATAGCTOa 
AOGTGTTTAT 
TATCXOTGTA 
CAACTCTGAT 
AAACAQAAGA 
CTTTTGCATT 
CAGTAGCTAA 
TATTCTTTGC 
AAAATTACTG 
TCATTTTOAC 
TTGGTGGGAA 



31 

I 

GCCTGTCATC 
AACAGATACC 
CTCTGTTCTT 
AQATACTTTO 
TQGTCGCCAC 
COGAAATATA 



TA CCAOTC TT 
CAGTTTTTGT 
AGCAAAGTTT 
TTCATTATTG 



OQCTTGGGTA 



AATGGIGTGC TCTGTGCAAC 



OT6GTCCCGC 
TACATGTGGA 
CAGAAATGAT 
ATACCCTATG 
TCTTTCATCG 
GTCATTGCTG 



TCTCATGTTC 



TQATOQTTTT 
AAATGTTCTA 
AGCAGACAAC 



TGGATTOGTC 
CroCTTTCCT 
ACAACTTTCT 



GTAATGGCAA 
TTTGCAAAGC 
CATAACICCT 
CTTATCCATA 
TACTTGACAT 
GACCTGGTAA 
GAATGCTTTG 
GTTTTCCACA 
ATTGATTGCC 
TTTATCATTC 
AAGATTATGT 
ATGGCTATTA 
GACAATTTCT 
ACTTTAAAXA 



TGOTCAATAA 
ATCCTTTTAT 
TTCAAAGAAT 
GACTTTCCAC 
GAAAGGTCTC 
GGGCAATTTC 
CXaVAT GCCA T 
TCTTAGTTTA 
TGTCCATCGT 
TTACTGGCTT 
CATTTGGAAG 
TGACAAGAGA 



TCGGGATAGT 
CATCAGCCTG 
CTTGTGTCAT 
CAAATACTC31 
CTCTCACAAA 
TTAGTATCTT 



1020 
1080 
1140 
1200 
1260 



I I I 

MQYQRQEPVI PPQRGLPYSM KQAGFPLGIIi LLPWVSYVTD PSLVLLIKGG ALSGTDTYQS 
LVNKTFGFPG YLLLSVLQPI. YPFIAMISYN IIAGDTLSKV FORIPGVDPE NVFIGRHFII 
GLSTVTPTLP tiSLYRMIAKL GJCVStlSTGI. TTIiILGIVMA RAISLGPHIP KTEDAWVFAK 
PNAIQAVGVM SFAFICHHMS FLVYSSI.EEP TVAKNSRLIH MSIVISVFIC IFPATCGYLT 
FTGFTQaDLF ENYCRHSDLiV TPGRFCYQVT VII.TYPKECF VTREVIANVF FGGHIfSVFB 
IWTVMVITV ATIATSLIiIDC U3IVI.EUIGV I.CATPLIFII PSACYLKLSE EPRTKSDKIM 
SCVMLPIGAV VMVFCFVMAI TMTQDCTHGQ BMFTfCFPDNF SLTHTSBSHV QOTTQLSTUr 
ISIFOLB 

Seq ID NO: 672 DNA sequence
Nucleic Acid Accession #: 1
Coding sequence: 1..1203 



SI 
I 

TTTATT6ATA 
AACTTTCGGC 
AGCAATGATA 
CCCAGGAGTT 
AGTTACCTTT 
CCTCATCTCT 



TCAAQCGQTC 
CAGTTCTCTA 
GATTTCTGTA 
CACCCAAGGG 
ATTTTGTTAT 
GGTAATTGCC 
AGTGATGGTC 
TCTAGAACTC 
TTATCIGAAA 
GCTTCOCATT 
AGACTGCACC 
TACCTCAGAG 
TCAACTCGAG 



1020 
1080 
1140 
1200 






1 11 21 31 41 51 

I I I I I I 

MGVQRQEPVI PPQPSLVIAI KGGAIiSCTOT YQSLVNKTTO CFOYUJiSVIi QPIiYPFIAMI 
SVNilAGDTL SKVFQRIPGV DPENVFIGRB FIIGLSTVTF TLFLSLYRNI AKLGICVSIiIS 
10 TCLTTLILGI VMARAISU3P HIPKTE3JAWV FAKPNAIQAV GVMSFAFICH IOISFI.VYSSL 
EEPTVAKWSR LIHMSIVISV PICIFFATCG YLTFTGFTQG DLFEMYCStllD DLVTFGHPCY 
GVTVILTYPM ECFVTREVIA NVFFGGNLSS VFHIWTVMV ITVATI>VSLI< IDCUJIVIiEL 
MGVliCATPLX FIIPSACYLK LSEEPRTHSD KIMSCVMLPI GAWMVFGFV MMTMTQDCT 
HGQEMFYCFP DKFSLTKTSE SHV1QQTTQI.S TUTISIPQIiB 

Seq ID NO: 674 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1140

1 11 21 31 41 51

I I i I I i 

ATGGGCTACC AGAGGCAGGA QCCTGTCATC CCGCCGCAGG TCAATAAAAC TrTCGOCTTT 
CCSU3GGTAIC TGCTCCTCTC TQTTCTTCAG TTTTTGTATC CTTTTATAGC AATG ATAA GT 
TACAATATAA TAOCTGGAGA TACTTTGAGC AAAGTTTTTC AAAGAATCCC AGGAGTTGAT 
25 CCTQAAAAOG TGTTTATK3Q TOGCaurTTC ATTATTGQAC TTTCCRCAGT TACCTTTACT 
CTGCCTTTAT CCTTGTAC06 AAATATAGCA AAGCTTGGAA AGGTCTCCCT CATCTCTACA 
• GGTTTAACAA CTCTGATTCT TGGAATTGTA ATCGCAAGGG CAATTTCACT GGGTCCaCAC 
ATACCaVAAAA CAQAAQACGC TTGGGTATTT GCAAAGCCCA ATGCCATTCA AGCGGTCXSGa 
GTTATGTCTT TTQCATTTAT TTGCCACCAT AACTCCTTCT TAGTTTACAG TTCTCTAGAA 
30 GAACCCACAG TAGCTAAGTG GTCCCGCCTT ATCCATATGT CCATCXJTGAT TTCTGTATTT 
ATCTCTATAT TCTTTGCTAC ATGTGGATAC TTGACATTTA CTGGCTTCRC CCAAGGGGAC 
TTATTTQAAA ATTACTGCAG AAATGATGAC CTGGTAACAT TTGGAAGATT TTGTTATGGT 
QTCACTCTCA TTTTGACATA CCCTATGQAA TQCTTTGTGA CAAGAGAGGT AATTGCCAAT 
GTGTTTTTTG CTQGGAATCT TTCATCQOTT TTCCACATTG TTGTAACAGT GATGGTCATC 
35 ACTGTAGCCA CGCTTGTGTC ATTGCTGATT GATTQCCTCG GGATAGTTCT AGAACTCAAT 
GGTGTGCTCT GTGCAACTCC CCTCATTTTT ATCATTCCAT CASCCIX3TTA TCTGAAACTG 
TCIGAAGAAC CAAGGACACA CTCCQATAAO ATTATeTCTT 6TGTCATGCT TCCCATTGGT 
TGaTTTTTGG ATTCQTCATG GCTATTACAA ATACTCAAGA CTGCACCCAT 
TOTTCTACTG CTTTCCTOAC AATTTCTCTC TCACAAATAC CTCAGAGTCT 
AGACAACACA ACTTTCTACT TTAAATATTA GTATCTTTCA ACTCQAGTAA 



1 11 21 31 41 51

I I I I I I 

MGYQRQEPVl PPQVNKTFGF PGYIiLLSVLQ FtYPPIAMIS YMIIAGDTLS KVFQRIEGVD 
PENVFIGRHF IIGLSTVTFT LPIiSLYRNIA KU3SCV3LIST GLTTLIIjGIV MARAISIiGPH 
IPKTEDAWVF AKPNAIQAVG VMSFAFICHH NSFI.VYSSLE EPTVAKWSRI. IHMSIVISVP 

50 ICIFFATCGY LTFTGFTQGD LFEHYCRNDD I<VTFGRFCYG VTVILTYPMB CPVTREVIAM 
VFFGGNLSSV FHIWTVMVI TVATLVSIiliI DCI.aiVI.BLN OVltflATPLlP IIPSACYXJO. 
SEEPRTHSDK IMSCVMLPIG AWMVFGFVM AITNTQDCTH GQBMFYCFPD NPSLTMTSES 
KVgQTTQLST LMISIFQIjB 

Seq ID NO: 676 DNA sequence

Nucleic Acid Accession #: NM_006853.1
Coding sequence: 26..874

1 11 21 31 41 51 

60 I I I I I I 

AGGAATCTGC GCTCGGGTTC CGCAGATGCA GAGGTTGAGG TGGCTGCGGG ACTGQAAGTC 
ATCGGGCAGA GGTCTCACAG CAGCCAAGGA ACCTGGGGCC CGCTCCTCCC CCCTCCAGGC 
CAIGAGGATT CTGCAGTTAA TCCTGCTTGC TCTGGCAACA GGGCTTGTAG GGGGAGAGAC 
CAGGATCATC AAGGGGTTCG AGTGCAAGCC TCACTCCCAG CCCTGGCAGQ CAGCCCTGTT 
65 CGAGAAGACQ CGGCTACTCT QTGGGGCQAC GCTCATOGCC CCCAGATGGC TCCTGACAGC 



GGAQGGCTQT GAGCAGACCC GOACAGCCAC TOAGTCCTTC CCCCACCCCG GCTTCAACAA 420 

CAGCCTTCCC AACAAAQACC ACCGCAATGA CATCATGCTQ QTGAAGATGO CATCGCCAGT 480 

CTCCATCACC TGGGCTGTGC GACCCCTCAC CCTCTCCTCA OGCTGTGTCA CTGCTGGCAC S40 

70 CAGCTGCCTC ATTTCCGQCT GGGGCAGCAC GTCCAGCCCC CAGTTACGCC TGCCTCACAC 600 

CTTGCQATGC GCCAACATCA CCATCATTGA GCACCAGAAG TGTGAGAACQ CCTACCCCGG 660 

CAACATCACA GACACCATGG TGTGTGCCAG CGTGCAGGAA GGGGGCAAOG ACTCCTGCCA 720 

GGOTGACTOC GQGGGOCCTC TGGTCTGTAA CCAGTCTCTT CAAGGCATTA TCTCCTGGGO 780 

CCAGGATCCQ TQTGCQATCA CCCQAAAQCC TGGTGTCTAC ACGAAAOTCT GCAAATATGT 840 

75 GOACTGGATC CAGGASACGA TOAAGAACAA TTAQACIGGA CCCACCCACC ACAGCCCATC 900 

ACCCTCCATT TCCACTTGGT GTTTGGTTCC TGTTCACrCT GTTAATAAGA AACCCTAAOC 960 

CAAGACCCTC TACGAACATT CTTTGGGCCT CCTGSACTAC AGGAGATGCT GTCACTTAAT 1020 

AATCAACCTO GGGTTCGAAA TCAGTOAGAC CTGGATTCAA ATTCTGCCTT OAAATATTOT 1080 

GACTCTGGGA ATGACAACAC CTGGTTTGTT CTCTGTTGTA TCCCCAGCCC CAAAGACAGC 1140 

80 TCCTGGCCAT ATATCAAGGT TTCAATAAAT ATTTOCTAAA TGAGTG 



85 




AHCLKPRVIV HLOQHNLQKE EGCEQTRTAT ESPPHPGFNN SLPNKDHRND IMI,\TOIASPV 
«TOA^£t isS^Cvi^ SCLISGWGST SSKH^PHT LRCAKITIIE HQI^MPG 
NITDTMVCAS VQEGGKDSCQ GDSGQPI.VC1I QSIiQGIISWO QDSCMTRKP GVYTKVCKYV 
DWIQBTMKHH 

Seq ID NO: 678 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..933

1 11 21 31 41 51 

iTGTGCACSCA LcSGACGGTG OVTCCCGGGC GCCTGGCAGT GTGACGGGCT GCCTGACTGC 
TTCGACAAQA GTGATCAGAA GGAGTGCCCX: AAGGCTAAC3T CGAAATGTGG CCravCCTTC 
???^SS?G CCAGCX3GCAT CCATTGCATC ATTGGTCGCT TCOQGTGCAA ^GGOTTTGAG 
GACTGTCCOG ATQGCASCGA TGAAGAGAAC TGCSVCAGCftA ACCCTCTQCT TTGCTCXaVCC 
ACT^oS CGGCCrCTCT ATTOACAAGA GCTTCATCTG CGATCGACAG 
SS^TC ScAACAG TCATGAGGAA AGCTOTQAAA GTTCTCAAGA ACCCGGCAGT 
^^^^Stgt ??5tgaStc AGAGAACCAA CTTGTGTATT ACCCCAGCAT cacctatocc 
ATCATCGGCaV GCTCCGTCAT TTTTGTGCTG GTGGTGGCCX: TGCTGGCACT GGTCTTGCAC 
^SgCOTA AGOSGAACAA CCTCATGACX5 CTCCCCXTTOC ACCGGCTGC3V GOVCCCTGTG 
S^S^ GCCTGGTGGT CCTGGACCAC CCCX3.CCACT GCAACGTCAC CTACAAOGTC 

SSaatSgca tccagtatct ggccagccag goggagcaoa atgcgtogga agtaggctcc 

ACTCCGAGGC CTTGCTGGAC CAG^GGCCTG CGTOGT^ 

ccgccctact cttctgacac ggaatctctg aaccaaoccg aoctgccccc ct^ogctcc 

CGGTCCGGOA GTGCCAACAG TQCCAOCTCC CRGGCAGCCR GCAGCCTCCT OMOTTOGAA 

^cSgcc acagcccgoq gcagcctggc.ccccagcsagg gcactgctga gcccaggqac 
tctgagccca. gccagggcac tcaaoaagta taa 

Seq ID NO: 679 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51 

LsMGHClPG AWQCDGLPDC PDKSDEKECP KAKSKCGPTF FPCASGIHCI IGRFRCNGFE 
SSS^S^ CT^^ ARYHCKHQI^ IDKSFICDGQ NNCQDNSDEE SCE3SQEPGS 
nXvPUTSmO LVTfYPSITYA IIGSSVIFVI. WALIALVLH HQRKHNKLMT LPVHRIQHPV 

SS^S SIJgiqwasq abqnasevgs ppsysealld 

PPYSSDTBSL NQADr-PPraa RSGSANSASS QAASSLLSVE DTSHSPGQPG PQBGTABPHD 



51 



Seq ID NO: 680 DNA sequence
Nucleic Acid Accession #: 378203.1
Coding sequence: 1..2190

1 11 21 31 41 51

ATGAATCCTT IcCAGAAAAA TGAGTCCAAG GAAACTCrTT •rTTCAOCTGT CT^TT^ 60 

SI^ACCAC CTCGACCACC TAGCCCTCCA AAGAAGCCAT CTCOGACAM «0 

I^iATCCAC TGAGCATTGC CTTCATTGTG GTGAATOftAT TCTGOGAGCQ CTTTTCCTAT 180 

ISiGCT GATCCTGTAT TTCCTGTATT TCCTGCACTG GAA^OA^ 2" 

iS^^ACAT CTATATACCA TGCCTTCAGC AGCCTCIGTT ATTTTACTCC CATCCTGGGA 300 

SSgcSJxG CTGACICGTG GTTCGGAAAA TTCAAGACAA TCATCTATCT CTCCTTGGTG 3S0 

TmSgCTTO GCCATOTGAT CAAGTCCTTG GGTGCCTTAC CAATACTGGG AGGACAAGTG 420 

ctSS^ gatcggcctg agtctaatab ctttggg^ til 

AAATCCTGTG TGGCAGCTTT TGGTOaAaAC CAGTTTGAAG AAAAACATOC «»GGAACGG 540 

ACTMMACT TCTCAGTCTT CTACCTGTCC ATCAATGCAQ GGAGCTTCAT TTCTACATTT 600 

A^ScAC^ TCCTCAOAGG AGATGTGCAA TGTTTTGGAG RAGACTeCXA TGCATTGGCT 660 

??5^GTTC SgScTGCT CATGGTAATT GCACTTGTTG TCTTTGCAAT GGGAAGCAAA 720 

A?IS^S AACCACCCCC TGAAGGAAAC ATAGTGGCTC AAGTTTTCAA ATGTATCTGG 780 

^TTOTATTT CCAATCGTTr CAAGAACCGT TCTGGAGACA TTCCAAAGCG ACAGCACTGG 840 

Sa^ACTGGG CAGCTGAGAA ATATOCAAAG CAGCTCATTA TGGATGTAAA GGCRCTGACC 900 

^TACTAT TCCTTTMAT CCCATTGCCC ATCTTCTCGG CTCT TTTGGA TCAfiCAGGGT 960 

ctSgCAAGC CATCAGGATG AATASGAATT TGQGCTTTTT TGTGCTTCAG 1020 

TGCaStTCT AAATCCCTTT CTGGTTCTTA TCTTCATCCC GTTGTTTGAC 1080 

^^^^ l^^^ CTCCAMTGT GGAATTAACT TCTCATCACT TAGGAAAAT6 1140 

^^C^ CT^CTGGCA TITGCAGTTO OQGCftCCrGT «^ 1200 

a5S^ ?^^?^C ^SSSgS aXX3UK5AGG TTTTCC^^^ 12|0 

i^CMMG ATGAGGTGAA GGTGACAGTG GTGGGAAATG AAAACAATTC TCTGTrOATA 1320 

S^?^^ ^^?^??T^ GAAAACACCA CACTATTCCA AACn^OT ^AA^^ 1380 

iS^SSr TTCACTTCCA CCIGAAATAT CACAATTTGT CTCTCTACAC TGAGCATTCT 1440 

Si^CAG^ AGAACTGGTA CAGTCTTGTC ATTCGTGAAG ATGGGAACAG TATCTCCMC 1500 

A?^S^ AGQATACMA AAGCAAAACA AGCAATO©^ 15" 

Ai^AfTOGC ATAAAGATOT CAACATCTCC CTGAGTACAG ATACCTCTCT CAATGTTGGT 1620 

S^SS^ G?S?^^ S^SAACT GTGCAAAOAO GA^^ "80 

TOTOGAACAG AAGATAAGAA CTTTTCTCTG AATTTGGGTC TTCTAGACTT TGGTGCAGCA 1740 

SaSS CAGGOTCTTC AGGCCTG^ "00 

ATTCCAGcS ACAAAATGTC CATTGCGTGG CAGCTACCAC AATATGCCCT ^TTACRGCT 1860 

^l^rSlI™ TY^T^TTTf-rGT CACAGGTCTT GAGTTTTCTT ATTCTCAGGC TCCCTCTAGC 1920 

SSSj TGGCCTGGTA CAGTGGGCCG AATTCATTT^ "40 

OTCCTGCTGG TGATCTGCCT GATCTTCTCC ATCATGGGCT ACTACTATOT TCCTQTAAAe 2100 

TC^SgTCC AGCAGATAAG CACATTCCTC ACATCCAQO^ 2"0 
AAACTAGAOA CCAAGAAGAC AAAACTCTGA 
HNPPQKNESK ETLPSPVSIE EVPPRPPSPP KKPSPTICX3S MYPLSIAFIV VNEFCERPSY 60
YGMKAVLII.V FLYPLHWNED TSTSIYHAFS SLCYFTPILQ AAIADSWIjGK FKXHYI.S1.V 120 



lYNKPPPEGN IVAQVFKCIW FAISNRFKNR SGDIPKRQHH LDWWVBKYPK QI.IMDVKALT 



LWAQFSGLV QWAEFILFSC LLLVICLIFS IKGYYYVPVK TEDMRGPADK HIPKIQGWJI 



Seq ID NO: 682 DNA sequence

Nucleic Acid Accession #: NM_016077.5

Coding sequence: 128..667
TC3GCTTTGTG ATTCTTGATC CGGAACTTTQ TCACCCAGGA ACCCC3GGAAG AGGTAGCTCA
CGCC3ATAGAA ACXSTGTTOGC TTGCCCAGAA GAAGGGAAGG GGOGftSTGAQ GAAAGGRGGT 

25 ACTGTAGATG CCCTCCAAAT CCTTQGTTAT GGAATATTTG GCTCATCCCA GTACACTOQG 
CTTGGCTGTT GGACTTGCTT GTGGCATGTG CCTGGGCTGG AGCCTTCOAO TATGCTTTGQ 
GATGCTCCCC AAAAGCAAGA OGAGCAAGAC ACACACAQAT ACTQAAAGTG AAGCAAGCAT 
CTTGGGAGAC AGCGGGGAGT ACAAGATGAT TCTTGTGGTT OGAAATGACT TAAAGATGGG 
AAAAGGGAAA GTGGCTGCCC AGTGCTCTCA TGCTGCTGTT TCAGCCTACA AGCAGATTCA 

30 AAGAAOAAAT CCTGAAATGC TCAAACayVTQ GGAATACTGT GGCCAGCCCA AGGTGGTGGT 
CAAAGCTCCT GATGAAGAAA CCCTQATTGC ATTATTGOCC CATGCAAAAA TGCTGGGACT 
QACTGTAAGT TTAATTCAAG ATGCTGQACG TACTCAGATT GCACCAOGCT CTCAAACTOT 
OCTAGGQATT GGGCCAGGAC CAGCAGACCT AATTGACAAA GTCACTOOTC ACCTAAAACT 
TTACTAlGGTG GACTTTGATA TGACAACAAC CCCTCCATCA CAAaTGTTTa AAGCCTQTCA 

35 GATTCIAACA ACAAAAGCTQ AATTTCTTCA CCCAACtTAA ATQTTCTTXa GATGAARATA 
AAACCTATTC CX»TGTTCTA AAAAAA 
MPSKSLVMEY LAHPSTLGUV VGVAOTICKS WgUlVCFGMI. PKaKTSKTHT DTB3BASIIiG
OSGEYKMILV VRNDLKMCKO KVAAQCSHAA VSAYKQIQRR HPEMLKOMBY CGQPKWVKA 
45 POBBTI.IALL AKAXMI/3LTV SLIQDASRTQ IAF6SQTVI.G IGPGPADLID KVTOUjKLY 

Seq ID NO: 684 DNA sequence
I I I I I I

OGGAACGAGG GCAACCTGCA. CAGCCATGCC CGGGCAAGAA CTCAGGACGG TGAATGGCTC 60 

TCAGATQCTC CT G OTGTTGC TGGTGCTCTC GTGGCTGCCG CWTCGGOGCG CCCTQTCTCT 120 

55 GGCCX3AGGCQ AGCCXKX3CAA GTTTCCCGGG ACCCTCAGAQ TTGCACTCCO AAGACTCCAG 180 

ATTCCGAGAG TTGCGGAAAC GCTACX3AGGA CCTGCTAACC AGQCZGOGGG CCAACCAGAG 240 

CTGGGAAGAT TCGAACACCG ACXTTCGTCCC GGCCCCTGCX 6TCCX3GATAC TCACGCCBGA 300 

AGTGCGGCTG GGATCCGGCG GCCACCTGCA CCTGCXSTATC TCXOGGGCOG COCTTCCXMA 360 

GGGGCTCCCX: GAGGCCTCCC GCCTTCACCG GGCTCTGTTC CGGCTGTCCC COACGGOQTC 420 

60 AAGGTCGTGG GACX3TGACAC GACXKCTGCG GCGTCAGCTC AGCCTTGCAA GACCTCAAGC 480 

GCCCGCGCTG CACCTGCGAC TGTCGCCX3CC GCCGTCGCSU3 TCGGAOCAAC TGCTGGCAGA 540 

ATCTTCGTCX: GCACGGCCCC AGCTGGAGTT GCACTTGCGG CXXSCAAOCCG CCAGGGGGCQ 600 

CCX3C3U3AGCX3 CXyTGCQCGCA ACGGGGACGA CIQTCCGCTC GOGCCGGGGC GTTGCTGCCG 660 

TCTGCACACG GTCaXHCXTT CGCTGaAAaA CCTGQGCTGQ GCCX3ATTGGG TQCTGTOOCC 720 

65 ACGGGAGGTG CRAGTCACCA TGrGCATCQG CGCGTOCCOG AGCCAGTTCC GGGOGGCAAA 780 

CATGCACGCG CAGATCAAGA CGAGCCTGCA CCGCCTGAAG CCCGACACGG AGCCAGCSCC B40 

CTGCTGCGTG CCCGCC3U3CT ACAATCCCAT GOTGCTCKIT CAAAAGACOS ACACCGGGOT 900 

GTCX3CTCCAG ACOTATGATG ACTTGTTAGC CAAAGACTGC CACTGCSlTAT G AflCAGT CCT d60 

_^ GGTCCTTCC3V CTGTGCACCT GCaCaGGGGA GGOSACCTCA GTTGTOCTQC CX:iGiaGAAT 1020 

70 GGGCTCAAGG TTCXTOAGRC ACCCGATTCC TGCCCAAACA GCTOTATTTA TATAAGTCTG 1080 

TTATTTATTA TTAATTTATT GGGGTGACCT TCTTGOGGAC TCGG<33GCTO GTCTGATGGA H40 

ACTGTGTATT •TATTTAAAAC TCTaOTGATA AAAATAAAGC TQTCTGAACT GTTAAAAAAA X200 
AAAA 

75 

Protein Accession •< 

1 11 21 31 41 51 

80 MPGQELRTVN GSQMLI.VU.V LSWI.PHGGAL SIiAEASRASF PGPSEUISED SRFRBLHKRY 
EOU^TRLRAN QSWEDSNTOL VPAPAVRILT PEVRUSSGGH LRLRISRAAI. PEGLPBASRIi 
HRALFRLSPT ASRSHDVTRP LRRai.SI<ARF QAPAUILRLS PPPSQSDQLL ABSSSARPQIj 
EUHUtPQAAR GRBHARAIINQ ODCPLaPORC CRUITVRASL BDLQKAOWVL SPREVQVTKC 
IGACPSQPRA ANMHAQIKT8 UOIUCPDTBP APCCVPA9YN PMVLIQKTDT QVSLQTYDDL 

85 LARDCHCI 

Seq ID NO: 686 DNA sequence
ACCAAATCAA CCATAGGTCC AAGAAC3kATT GTCTCTGGAC GGCAGCTATQ CGACTCACCQ
TGCTGTGTGC TGTGTGCCTG CTGCCTGGCA GCCTGGCCXrr GCCGCTGCCT CAGGAGGCGG 
GAGGCATQAQ TGAGCTACAQ TGGGAACAGG CTCAGGACTA TCTCAAGAGA TTTTATCTCT 
, _ ATGACTCAGA AACAAAAAAT GCCAACABTT TAOAAGCCAA ACTCAAGGAO ATGCAAAAAT 
10 TCTTTOGCCT ACCTATAACT GGAATGTTAA ACTCXXGCX3T CATAOAAATA ATOCAGAASC 
CCAGATGTGG AGTOCCAOAT GTTGCAOAAT ACTCACTATT T0CAAATA6C CCAAAATGGA 
CTTOCAAAGT GGTCACCTAC AGGATOGTAT CATATACTOG ASACTTACOS CATATTACAG 
TGQATCQATT AOTGTCAAAG GCTTTAAACA TOTGGGGCAA AOAGATCCCC CTGCATTTCA 
GGAAAGTTGT ATGGGGAACT GCTGACATCA TQATTGGCTT TGOSCSAGGA GCTCATGGGO 
15 ACTCCTACCC ATTTGATGGQ CCAGGAAACA CGCTGGCTCA TGCCTTTGOG CCTGGGACAG 

GTCTCGGAGG AGATGCTCAC TTCGATGAGG ATGAACX3CTG GACGGATGGT BOCAGTCTAG 
GGATTAACTT CCTGTATGCT GCAACTCATG AACTTGGCCA TTCTTTGGGT ATGGGACATT 
CCTCTGATCC TAATQCAGTG ATGTATCCAA CCTATGOAAA TQGAGATCCC CAAAATTTTA 
_ - AACTTTCCCA GGATGATATT AAAGGCATTC AGAAACTATA TQaAAAOAiSA AOTAAITCAA 
20 GAAAGAAATA GAAACTTCAO GCAGAACATC CATTCATTCA TTCATTGGAT TGTATATCAT 
TGTTGC»CAA TCAOAATTOA TAAGCACTGT TCCTCCACTC CATTTACCAA TTATGTCACC 
CTTTTTTATT GCAGTTGGTT TTTGAATGTC TTTCACTCCT TTTATTGQTT AAACTCCTTT 
ATGGTGTGAC TGTGTCTTAT TCCATCTATG AGCTTTQTCA GTGGGGGTAG ATGTCAATAA 
ATGTTACATA CACAAATAAA TAAAATGTTT ATTCCATCQT AAATTTA 



1 11 21 31 41 51

I I I I I I 

MRl,TVtCAVC U.PGSLALPI. PQEAGGMSEL QWEQAQDYUC RPYIiYDSETK NANSLEAKLK 
EMQKFFai.PI TGMLHSRVIE IMQKPRCGVP DVAEYSIf PK SPXWTSKWT YRIVSYTHDL 
PHITVDRLVS KAIJIMWaKEI PliHFRKWMG TADIMIGFAR GAHGDSYPPD OPOHTLAHAP 
APGTGLGGDA HFDEajERWTD GSSLGIMFLY AATKEMHSU GMGHSSDPtTA VMYPTYraiGO 
POMFKI>SQDD IKGIQKIjyOK RSNSRKK 



Coding sequence: 1..870
ATGACAGGAG TGTTTCACAG AAGG6TCCCC AGCATCCGAT COGGCGACTT CCAAGCTCCG 60

TTCCAGACGT CCGCAGCTAT GCACCATCCO TCTCAGGAAT OJCXSkACTTT GCXXXJAQTCT 120 

TCAGCTACCG ATTCTGACTA CTACAQCCCT ACGGGGGGAG CCCCGCACXSG CTACTGCTCT 180 

CCTACCTCQG CTTCCTATGG CAAAGCTCTC AACCCCTACC AOTATCAGTA TCACGGCGTG 240 

AACGGCTCCG CXX3GGAGCTA CCCAGCCAAA GCTTATGCCG ACTATAGCTA CGCTAGCTCC 300 

TACC»CCAGT ACX3GCGGCQC CTACAACCGC GTCCCAAGCG CCACCAACCA GCCAGAGAAA 360 

GAAGTSACCG AGCCCGAGGT GAGAATOGTQ AATOGCAAAC CAAAGAAAGT TOGTAAACCC 420 

AGGACTATTT ATTCCAGCTT TCaWSCTGGCC GCATTAGAOA 6AAGGTTTCA GAAOACTCAG 480 

XACCXCBCCX IGCCGGAACO CGCCGAGCT6 GCCGCCTCGC TB6GATTQAC ACAAACACAG 540 

GTGAAAATCr GGTTTCAGAA CAAAAQATCC AAGATCAAOA AQATCATGAA AAACGQQQAa 600 

ATGCCOCCGG AGCACAGTCC CAGCTCCAGC GACCCAATGG CQTGTAACTC GCCGCAGICT 660 

CCAGGGOrer gggagcccca gggctogtcc cgctcxjctca gccaccaccc tcatgcccac 

CCTCGGACCT CCAACCAGTC CCCAGCGTCC AaCTACCTGG * " 
ACAAOTQCAG CCAGCTCAAT CAATTCCCAC CTQCCGCCGC CQGGCTCCTT f 
CrGGOaCTGO CCTCGGGOAC ACTCTATTAG 



1 11 21 31 41 51 

I i i I I I 

MXGVFURRVP SIRSGDFQAP FXJTSAAMKHP SQESPTUES SATDSDYYSP TGGAPHGYCS 
PTSASYGKAI. NPyQYQYHGV NGSAGSYPAK AYADYSYASS YHQYGGAYNR VPSAIWQPEK 
EVTBPEVRMV KGXPKKVRKP RTIYSSFQLA ALQRKFQKTQ YLALPBHAEIj AASLGLTQTQ 
VKIWFQNKHS KIKKIMKNOE MPPEHSPSSS DPMAOISPQS PAVWEPQGSS RSLSHHPHAH 
PPTSRQSPAS SYLENSASWY TSAASSIMSH LPPPGSLQBP LAIiASGTLY 
It is understood that the examples described above in no way serve to limit the true
scope of this invention, but rather are presented for illustrative purposes. All publications,
sequences of accession numbers, and patent applications cited in this specification are herein
incorporated by reference as if each individual publication or patent application were
specifically and individually indicated to be incorporated by reference.
WHAT IS CLAIMED IS:

1 1. A method of detecting a lung cancer-associated transcript in a cell

2 from a patient, the method comprising contacting a biological sample from the patient with a

3 polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence

4 as shown in Tables 1A-16.

1 2. The method of claim 1, wherein the polynucleotide selectively

2 hybridizes to a sequence at least 95% identical to a sequence as shown in Tables 1A-16.

1 3. The method of claim 1, wherein the biological sample is a tissue

2 sample. 

1 4. The method of claim 1, wherein the biological sample comprises 

2 isolated nucleic acids. 

1 5. The method of claim 4, wherein the nucleic acids are miRNA. 

1 6. The method of claim 4, further comprising the step of amplifying 

2 nucleic acids before the step of contacting the biological sample with the polynucleotide. 

1 7. The method of claim 1, wherein the polynucleotide comprises a

2 sequence as shown in Tables 1A-16.

1 8. The method of claim 1, wherein the polynucleotide is labeled.

1 9. The method of claim 8, wherein the label is a fluorescent label.

1 10. The method of claim 1, wherein the polynucleotide is immobilized on

2 a solid surface.

1 11. The method of claim 1, wherein the patient is undergoing a therapeutic

2 regimen to treat lung cancer.

1 12. The method of claim 1, wherein the patient is suspected of having lung

2 cancer. 



1 13. A method of monitoring the efficacy of a therapeutic treatment of lung

2 cancer, the method comprising the steps of:

3 (i) providing a biological sample from a patient undergoing the therapeutic

4 treatment; and

5 (ii) determining the level of a lung cancer-associated transcript in the

6 biological sample by contacting the biological sample with a polynucleotide that selectively

7 hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1A-16,

8 thereby monitoring the efficacy of the therapy. 

1 14. The method of claim 13, further comprising the step of: (iii) comparing

2 the level of the lung cancer-associated transcript to a level of the lung cancer-associated

3 transcript in a biological sample from the patient prior to, or earlier in, the therapeutic

4 treatment.

1 15. The method of claim 13, wherein the patient is a human.

1 16. A method of monitoring the efficacy of a therapeutic treatment of lung 

2 cancer, the method comprising the steps of: 

3 (i) providing a biological sample from a patient undergoing the therapeutic

4 treatment; and

5 (ii) determining the level of a lung cancer-associated antibody in the biological

6 sample by contacting the biological sample with a polypeptide encoded by a polynucleotide

7 that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in

8 Tables 1A-16, wherein the polypeptide specifically binds to the lung cancer-associated

9 antibody, thereby monitoring the efficacy of the therapy. 

1 17. The method of claim 16, further comprising the step of: (iii) comparing 

2 the level of the lung cancer-associated antibody to a level of the lung cancer-associated 

3 antibody in a biological sample from the patient prior to, or earlier in, the therapeutic 

4 treatment. 

1 18. The method of claim 16, wherein the patient is a human. 

1 19. A method of monitoring the efficacy of a therapeutic treatment of lung

2 cancer, the method comprising the steps of:

3 (i) providing a biological sample from a patient undergoing the therapeutic

4 treatment; and

5 (ii) determining the level of a lung cancer-associated polypeptide in the

6 biological sample by contacting the biological sample with an antibody, wherein the antibody

7 specifically binds to a polypeptide encoded by a polynucleotide that selectively hybridizes to

8 a sequence at least 80% identical to a sequence as shown in Tables 1A-16, thereby

9 monitoring the efficacy of the therapy.

1 20. The method of claim 19, further comprising the step of: (iii) comparing

2 the level of the lung cancer-associated polypeptide to a level of the lung cancer-associated

3 polypeptide in a biological sample from the patient prior to, or earlier in, the therapeutic

4 treatment. 

1 21. The method of claim 19, wherein the patient is a human. 

1 22. An isolated nucleic acid molecule consisting of a polynucleotide 

2 sequence as shown in Tables 1A-16.

1 23. The nucleic acid molecule of claim 22, which is labeled. 

1 24. The nucleic acid of claim 23, wherein the label is a fluorescent label.

1 25. An expression vector comprising the nucleic acid of claim 22. 

1 26. A host cell comprising the expression vector of claim 25. 

1 27. An isolated polypeptide which is encoded by a nucleic acid molecule 

2 having a polynucleotide sequence as shown in Tables 1A-16.

1 28. An antibody that specifically binds a polypeptide of claim 27. 

1 29. The antibody of claim 28, further conjugated to an effector component.

1 30. The antibody of claim 29, wherein the effector component is a

2 fluorescent label. 

1 31. The antibody of claim 29, wherein the effector component is a

2 radioisotope or a cytotoxic chemical. 

1 32. The antibody of claim 29, which is an antibody fragment.

1 33. The antibody of claim 29, which is a humanized antibody.

1 34. A method of detecting a lung cancer cell in a biological sample from a

2 patient, the method comprising contacting the biological sample with an antibody of claim 

3 28. 

1 35. The method of claim 34, wherein the antibody is further conjugated to

2 an effector component. 

1 36. The method of claim 35, wherein the effector component is a 

2 fluorescent label. 

1 37. A method of detecting antibodies specific to lung cancer in a patient, 

2 the method comprising contacting a biological sample from the patient with a polypeptide

3 encoded by a nucleic acid that comprises a sequence from Tables 1A-16.

1 38. A method for identifying a compound that modulates a lung cancer- 

2 associated polypeptide, the method comprising the steps of:

3 (i) contacting the compound with a lung cancer-associated polypeptide, the

4 polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least 

5 80% identical to a sequence as shown in Tables 1A-16; and

6 (ii) determining the functional effect of the compound upon the polypeptide.

1 39. The method of claim 38, wherein the functional effect is a physical 

2 effect.

1 40. The method of claim 38, wherein the functional effect is a chemical 

2 effect. 

1 41. The method of claim 38, wherein the polypeptide is expressed in a 

2 eukaryotic host cell or cell membrane. 

1 42. The method of claim 38, wherein the functional effect is determined by

2 measuring ligand binding to the polypeptide. 

1 43. The method of claim 38, wherein the polypeptide is recombinant.

1 44. A method of inhibiting proliferation of a lung cancer-associated cell to

2 treat lung cancer in a patient, the method comprising the step of administering to the subject a 

3 therapeutically effective amount of a compound identified using the method of claim 38. 

1 45. The method of claim 44, wherein the compound is an antibody.

1 46. The method of claim 45, wherein the patient is a human. 

1 47. A drug screening assay comprising the steps of:

2 (i) administering a test compound to a mammal having lung cancer or a cell

3 isolated therefrom;

4 (ii) comparing the level of gene expression of a polynucleotide that selectively 



5 hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1A-16 in a

6 treated cell or mammal with the level of gene expression of the polynucleotide in a control 

7 cell or mammal, wherein a test compound that modulates the level of expression of the 

8 polynucleotide is a candidate for the treatment of lung cancer. 

1 48. The assay of claim 47, wherein the control is a mammal with lung

2 cancer or a cell therefrom that has not been treated with the test compound.



1 49. The assay of claim 47, wherein the control is a normal cell or mammal.

1 50. A method for treating a mammal having lung cancer comprising 

2 administering a compound identified by the assay of claim 47. 

1 51. A pharmaceutical composition for treating a mammal having lung

2 cancer, the composition comprising a compound identified by the assay of claim 47 and a

3 physiologically acceptable excipient. 

meaningful search from being carried out:

the description; the claims; the drawings

The failure of the nucleotide and/or amino acid sequence listing to comply with the standard provided for in Annex C
of the Administrative Instructions prevents a meaningful search from being carried out:

[ ] the written form has not been furnished or does not comply with the standard.

[X] the computer readable form has not been furnished or does not comply with the standard.

THE INTERNATIONAL SEARCH REPORT
(day/month/year) 15 AUG zoos



Applicant's or agent's file reference
18501-15-3PC 



FOR FURTHER ACTION See paragraphs 1 and 4 below



PCT/US02/12476



filing date
(day/month/year) 



18 April 2002 (18.04.2002)



Applicant

EOS BIOTECHNOLOGY, INC 



The applicant is entitled, if he so wishes, to amend the claims of the international application (see Rule 46):
When? The time limit for filing such amendments is normally two months from the date



Where? Directly to the International Bureau of WIPO, 34, c

1211 Geneva 20, Switzerland. Facsimile No.: (41-22) 740.14.35
For more detailed instructions, see the notes on the accompanying sheet.

The applicant is hereby notified that no international search report will be established and that the declaration under
Article 17(2)(a) to that effect is transmitted herewith.

[ ] With regard to the protest against payment of (an) additional fee(s) under Rule 40.2, the applicant is notified that:

[ ] the protest together with the decision thereon has been transmitted to the International Bureau together with the
applicant's request to forward the texts of both the protest and the decision thereon to the designated Offices.
[ ] no decision has been made yet on the protest; the applicant will be notified as soon as a decision is made.



from the priority date, the international application will be published by the International Bureau. If the
applicant wishes to avoid or postpone publication, a notice of withdrawal of the international application, or of the priority claim,

Rules 90bis.1 and 90bis.3, respectively, before the completion of the technical



Within 19 months from the priority date, but only in respect of some designated Offices, a demand for international preliminary
examination must be filed if the applicant wishes to postpone the entry into the national phase until 30 months from the priority date
(in some Offices even later); otherwise the applicant must, within 20 months from the priority date, perform the prescribed acts for
entry into the national phase before those designated Offices.

In respect of other designated Offices, the time limit of 30 months (or later) will apply even if no demand is filed within 19 months.